perm filename V2ANS1.TEX[TEX,DEK]7 blob sn#421462 filedate 1979-02-23 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00032 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00005 00002	% Note: page 22 line 209 "to appear" is imminent
C00006 00003	%folio 649 galley 1 (C) Addison-Wesley 1978	*
C00025 00004	%folio 654 galley 2 (C) Addison-Wesley 1978	*
C00038 00005	%folio 659 galley 3 (C) Addison-Wesley 1978	*
C00055 00006	%folio 662 galley 4 (C) Addison-Wesley 1978 	*
C00076 00007	%folio 666 galley 5 (C) Addison-Wesley 1978	*
C00097 00008	%folio 670 galley 6 (C) Addison-Wesley 1978	*
C00110 00009	%folio 674 galley 7 (C) Addison-Wesley 1978	*
C00124 00010	%folio 678 galley 8 (not on paper tape) (C) Addison-Wesley 1978	*
C00143 00011	%folio 687 galley 1 (C) Addison-Wesley 1978	*
C00156 00012	%folio 690 galley 2 (C) Addison-Wesley 1978	*
C00167 00013	%folio 693 galley 3 (C) Addison-Wesley 1978	*
C00187 00014	%folio 701 galley 4 (C) Addison-Wesley 1978	*
C00200 00015	%folio 707 galley 5 (C) Addison-Wesley 1978	*
C00211 00016	%folio 710 galley 6 (C) Addison-Wesley 1978	*
C00226 00017	%folio 715 galley 7 (C) Addison-Wesley 1978	*
C00233 00018	%folio 717 galley 8 (C) Addison-Wesley 1978	*
C00244 00019	%folio 720 galley 9 (C) Addison-Wesley 1978	*
C00255 00020	%folio 722 galley 10a (C) Addison-Wesley 1978	*
C00257 00021	%folio 723 galley 10b (C) Addison-Wesley 1978	*
C00261 00022	%folio 725 galley 1 (C) Addison-Wesley 1978	*
C00280 00023	%folio 731 galley 2a (C) Addison-Wesley 1978	*
C00292 00024	%folio 733 galley 2b (C) Addison-Wesley 1978	*
C00297 00025	%folio 735 galley 3a (C) Addison-Wesley 1978	*
C00303 00026	%folio 738 galley 3b (C) Addison-Wesley 1978	*
C00315 00027	%folio 742 galley 4a (C) Addison-Wesley 1978	*
C00328 00028	%folio 745 galley 4b (C) Addison-Wesley 1978	*
C00337 00029	%folio 748 galley 5 (C) Addison-Wesley 1978	*
C00356 00030	%folio 754 galley 6 (C) Addison-Wesley 1978	*
C00371 00031	%folio 758 galley 7a (C) Addison-Wesley 1978	*
C00382 00032	\eject % assumes that page break at this point works OK, so file v2ans1 not too long
C00387 ENDMK
C⊗;
% Note: page 22 line 209 "to appear" is imminent
\input acphdr % Answer pages (double-check position of figures)
\runninglefthead{ANSWERS TO EXERCISES}
\titlepage\setcount00
\null
\vfill
\tenpoint
\ctrline{ANSWER PAGES for THE ART OF COMPUTER PROGRAMMING}
\ctrline{(Volume 2)}
\ctrline{(first half of the answers)}
\ctrline{$\copyright$ 1978 Addison--Wesley Publishing Company, Inc.}
\vfill
\eject
\setcount0 496
%folio 649 galley 1 (C) Addison-Wesley 1978	*
\titlepage
\runninglefthead{ANSWERS TO EXERCISES}
\corners
\vskip 1in plus 30pt minus 10pt
\rjustline{\:;ANSWERS TO EXERCISES}
\vskip 1cm plus 30pt minus 10pt
\quoteformat{This branch of mathematics is the only one, I believe,\cr
in which good writers frequently get results entirely erroneous.\cr
. . . It may be doubted if there is a single\cr
extensive treatise on probabilities in existence\cr
which does not contain solutions absolutely indefensible.\cr}
\author{C. S. PEIRCE, in {\sl Popular Science Monthly} (1878)}
\vskip 1cm plus 30pt minus 10pt
\sectionskip\exbegin{NOTES ON THE EXERCISES}
\ansno 1. An average problem for a mathematically inclined reader.

\ansno 3. See W. J. LeVeque, {\sl Topics in Number Theory \bf 2}
(Reading, Mass.: Addison-Wesley, 1956), Chapter 3.\xskip ({\sl Note:} One
of the men who read a preliminary draft of the manuscript of
this book reported that he had discovered a truly remarkable
proof, which the margin of his copy was too small to contain.)

\ansbegin{3.1}

\ansno 1. (a)\9This will usually fail, since ``round''
telephone numbers are often selected by the telephone user when
possible. In some communities, telephone numbers are perhaps assigned
randomly. But it is a mistake in any case to try to get several
successive random numbers from the same page, since the same telephone
number is often listed several times in a row.

(b)\9But do you use the left-hand page or the right-hand
page? Say, use the left-hand page number, divide by 2, and take
the units digit. The total number of
pages should be a multiple of 20; but even so, this method will have some bias.

\eject % This gets part of Section 3.1 on the first page of the answers
(c)\9The markings on the faces will slightly bias
the die, but for practical purposes this method is quite satisfactory
(and it has been used by the author in the preparation of several
examples in this set of books). See {\sl Math.\ Comp.\ \bf 15} (1961),
94--95 for further discussion of these dice.

(d)\9(This is a hard question thrown in purposely
as a surprise.) The number is not random; if the average number
of emissions per minute is $m$, the probability that the counter
registers $k$ is $e↑{-m}m↑k/k!$ (the Poisson distribution). So
the digit 0 is selected with probability $e↑{-m} \sum ↓{k≥0}
m↑{10k}/(10k)!$, etc. The units digit will be even with probability
$e↑{-m}\mathop{\hbox{cosh}}m = {1\over2} + {1\over2}e↑{-2m}$, and
this is never equal to ${1\over2}$
(although the error is negligibly small when $m$ is large).

(e)\9Okay, provided that the time since the previous
digit selected in this way is random. However, there is possible
bias in borderline cases.

(f,g)\9No, people usually think of certain digits
(like 7) with higher probability.

(h)\9Okay; your assignment of numbers to the horses
had probability $1\over10$ of assigning a given digit to
the winning horse.

\ansno 2. The number of such sequences is the multinomial
coefficient $1000000!/(100000!)↑{10}$; the probability is this
number divided by $10↑{1000000}$, the total number of sequences of a million
digits. By Stirling's approximation the
probability is close to $1/(16π↑410↑{22}\sqrt{2π}) \approx 2.56
\times 10↑{-26}$, about one chance in $4 \times 10↑{25}$.
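This estimate is easy to check numerically: the following Python sketch evaluates the exact probability with log-gamma arithmetic and compares it against the closed form given above.

```python
from math import lgamma, log, exp, pi, sqrt

# exact probability 1000000! / ((100000!)^10 * 10^1000000) via log-gamma
log_p = lgamma(10**6 + 1) - 10 * lgamma(10**5 + 1) - 10**6 * log(10)
p = exp(log_p)

# Stirling's closed form from the text: 1/(16 pi^4 10^22 sqrt(2 pi))
stirling = 1 / (16 * pi**4 * sqrt(2 * pi) * 1e22)
assert 2.5e-26 < p < 2.6e-26
assert abs(p - stirling) / stirling < 0.01   # agreement well under 1 percent
```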

\ansno 3. 3040504030.

\ansno 4. Step K11 can be entered only from step K10 or step
K2, and in either case we find it impossible for $X$ to be zero
by a simple argument. If $X$ could be zero at that point, the
algorithm would not terminate.

\ansno 5. Since only $10↑{10}$ ten-digit numbers are possible,
some value of $X$ must be repeated during the first $10↑{10} + 1$
steps; and as soon as a value is repeated, the sequence continues
to repeat its past behavior.

\ansno 6. (a)\9Arguing as in the previous exercise, the sequence
must eventually repeat a value; let this repetition occur for
the first time at step $\mu + λ$, where $X↓{\mu+λ} = X↓\mu$.\xskip
(This condition defines $\mu$ and $λ$.) We have $0 ≤ \mu < m,\ 0
< λ ≤ m,\ \mu + λ ≤ m$. The values $\mu = 0, λ = m$ are attained
iff $f$ is a cyclic permutation; $\mu = m - 1, λ = 1$ occurs, e.g.,
if $X↓0 = 0,\ f(x) = x + 1$ for $x < m - 1$, and $f(m - 1) = m -
1$.

(b)\9We have, for $r ≥ n$, $X↓r = X↓n$ iff $r - n$ is
a multiple of $λ$ and $n ≥ \mu$. Hence $X↓{2n} = X↓n$ iff $n$ is
a multiple of $λ$ and $n ≥ \mu$. The desired results now follow
immediately.\xskip (Note: This is essentially a proof of the familiar
mathematical result that the powers of an element in a finite
semigroup include a unique idempotent element: take $X↓0 = a,\
f(x) = ax$.)

\ansno 7. (a)\9The least $n>0$ such that $n-(\lscr(n)-1)$ is a multiple of $λ$
and $\lscr(n)-1≥\mu$ is $n=2↑{\lceil \lg \max(\mu+1,λ)\rceil}-1+λ$.\xskip [This
may be compared with the least $n>0$ such that $X↓{2n}=X↓n$, namely
$λ\delta↓{\mu0}+\mu+λ-1-\left((\mu+λ-1)\mod λ\right)$.]

(b)\9Start with $X=Y=X↓0,\ k=m=1$.\xskip
$\biglp$At key places in this algorithm we will have
$X=X↓{2m-k-1}$, $Y=X↓{m-1}$, and $m=\lscr(2m-k)$.$\bigrp$\xskip To generate the
next random number, do the following steps: Set $X←f(X)$ and $k←k-1$. If
$X=Y$, stop (the period length $λ$ is equal to $m-k$). Otherwise if $k=0$, set
$Y←X,\ m←2m,\ k←m$. Output $X$.
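The procedure of part (b) can be sketched as follows; the test function $f$ below, with $\mu=3$ and $λ=7$, is a made-up example.

```python
def brent_cycle_length(f, x0):
    """Brent's variant: return the period length lambda of x0, f(x0), ...
    Invariant at the comparisons: X = X_{2m-k-1} and Y = X_{m-1}."""
    x = y = x0
    k = m = 1
    while True:
        x = f(x)              # generate the next element
        k -= 1
        if x == y:
            return m - k      # the period length lambda
        if k == 0:            # double the window: Y <- X, m <- 2m, k <- m
            y = x
            m *= 2
            k = m

# made-up test function with tail mu = 3 and cycle length lambda = 7
f = lambda x: x + 1 if x < 3 else 3 + (x - 2) % 7
assert brent_cycle_length(f, 0) == 7
assert brent_cycle_length(lambda x: x, 5) == 1   # mu = 0, lambda = 1
```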

{\sl Notes:} Brent has also
considered a more general method in which the successive values
of $Y=X↓{n↓i}$ satisfy $n↓1=0,\ n↓{i+1}=1+\lfloor pn↓i\rfloor$ where $p$ is
any number greater than 1. He showed that the best choice of $p$, approximately
2.4771, saves about 3 percent of the iterations by comparison with $p=2$.\xskip (See
exercise 4.5.4-4.)

The method in part (b) has a serious deficiency, however, since it might generate a
lot of nonrandom numbers before shutting off. For example, we might have a
particularly bad case such as $λ=1,\ \mu=2↑k$. A method based on Floyd's idea
in exercise 6(b), namely one that maintains $Y=X↓{2n}$ and $X=X↓n$ for
$n = 0,1,2,\ldotss$, will
stop before any number has been output twice. On the other hand if
$f$ is unknown (e.g., if we are receiving the
values $X↓0, X↓1, \ldots$ on-line from an outside source) or
if $f$ is difficult to apply, the following cycle detection algorithm
due to R. W. Gosper will be preferable: Maintain an auxiliary
table $T↓0, T↓1, \ldotss , T↓m$, where $m = \lfloor \lg n\rfloor$
when receiving $X↓n$. Initially $T↓0 ← X↓0$. For $n = 1, 2, \ldotss$,
compare $X↓n$ with each of $T↓0, \ldotss , T↓{\lfloor\lg n\rfloor}$; if
no match is found, set $T↓{e(n)} ← X↓n$, where $e(n) = \max\leftset e\relv
2↑e \hbox{ divides }n+1\rightset$. But if a match $X↓n =
T↓k$ is found, then $λ = n - \max\leftset m \relv  m < n \hbox{ and } e(m) =
k\rightset$.
This procedure does not stop before generating a number twice, but at most $\lceil
{2\over3}λ\rceil$ of the $X$'s will have occurred more than once.
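Gosper's scheme can be sketched as follows; each table entry carries the index at which its value was stored, so that $λ$ can be reported as soon as a match occurs.

```python
def gosper_lambda(seq):
    """R. W. Gosper's cycle detector: scan the values of seq and return the
    cycle length lambda as soon as some value reappears in the table."""
    T = []                                  # T[k] = (value, index when stored)
    for n, x in enumerate(seq):
        if n == 0:
            T.append((x, 0))
            continue
        top = n.bit_length() - 1            # floor(lg n)
        for k in range(min(top, len(T) - 1) + 1):
            if T[k][0] == x:
                return n - T[k][1]          # lambda
        e = ((n + 1) & -(n + 1)).bit_length() - 1   # max e with 2^e dividing n+1
        if e == len(T):
            T.append((x, n))
        else:
            T[e] = (x, n)
    return None                             # no repeat seen within seq

# a made-up sequence with tail mu = 3 and cycle 0,1,2,3,4 (lambda = 5)
assert gosper_lambda([100, 101, 102] + [0, 1, 2, 3, 4] * 4) == 5
```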

\ansno 8. (a,b)\9$00, 00,\ldots$ [62 starting values]; $10, 10,\ldots$
[19]; $60, 60,\ldots$ [15]; $50, 50,\ldots$ [1]; $24, 57, 24,
57,\ldots$ [3].\xskip (c) 42 or 69; these both lead to a set of fifteen
distinct values, namely (42 or 69), 76, 77, 92, 46, 11, 12,
14, 19, 36, 29, 84, 05, 02, 00.
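The two-digit middle-square iteration behind this answer is easy to replay; a minimal Python sketch:

```python
def mid2(x):
    # two-digit middle square: square x (up to four digits "abcd"), keep "bc"
    return (x * x // 10) % 100

assert all(mid2(x) == x for x in (0, 10, 60, 50))   # the fixed points of (a,b)
assert mid2(24) == 57 and mid2(57) == 24            # the 2-cycle

seen = []
x = 42
while x not in seen:        # follow 42 until a value repeats
    seen.append(x)
    x = mid2(x)
assert len(seen) == 15 and seen[-1] == 0   # fifteen distinct values, ending at 00
```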

\ansno 9. Since $X < b↑n$, we have $X↑2 < b↑{2n}$, and the middle square
is $\lfloor X↑2/b↑n\rfloor ≤ X↑2/b↑n$. If $X > 0$, then $X↑2/b↑n <
Xb↑n/b↑n = X$.

\ansno 10. If $X = ab↑n$, the next number of the sequence has
the same form; it is $(a↑2\mod b↑n)b↑n$. If $a$ is a multiple
of $b$ or of all the prime factors of $b$, the sequence will soon
degenerate to zero; if not, the sequence will degenerate into
a cycle of numbers having the same general form as $X$.

Further facts about the middle-square method have
been found by B. Jansson, {\sl Random Number Generators} (Stockholm:
Almqvist and Wiksell, 1966), Section 3A. Numerologists will
be interested to learn that the number 3792 is self-reproducing
in the four-digit middle-square method, since $3792↑2 = 14379264$;
furthermore (as Jansson has observed), it is ``self-reproducing''
in another sense too, since its prime factorization is $3\cdot 79 \cdot
2↑4$!
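Both self-reproducing properties of 3792 are easily verified:

```python
def mid4(x):
    # four-digit middle square: middle four digits of the eight-digit square
    return (x * x // 100) % 10000

assert 3792 * 3792 == 14379264
assert mid4(3792) == 3792        # self-reproducing under middle-squaring
assert 3 * 79 * 2**4 == 3792     # and under its own prime factorization
```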

\ansno 11. The probability that $λ = 1$ and $\mu = 0$ is the probability
that $X↓1 = X↓0$, namely $1/m$. The probability that $λ = 1, \mu
= 1$, or $λ = 2, \mu = 0$, is the probability that $X↓1 ≠ X↓0$
and that $X↓2$ has a certain value, so it is $(1 - 1/m)(1/m)$.
Similarly, the probability that the sequence has any given $\mu$
and $λ$ is a function only of $\mu + λ$, namely
$$P(\mu , λ) = {1\over m}\prod ↓{1≤k<\mu +λ}\left(1-{k\over m}\right).$$

For the probability that $λ = 1$, we have
$$\sum ↓{\mu ≥0}{1\over m}\prod↓{1≤k≤\mu}\left(1-{k\over m}\right)
={1\over m}Q(m),$$
where $Q(m)$ is defined in Section 1.2.11.3, Eq.\
(2). By Eq.\ (25) in that section, the probability is approximately
$\sqrt{π/2m} \approx 1.25/\sqrt{m}$. The chance of Algorithm K converging
as it did is only about one in 80000; the author was decidedly
unlucky. But see exercise 15 for further comments on the ``colossalness.''
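The probability that $λ = 1$ can be confirmed exactly for a small modulus by enumerating all $m↑m$ functions; for $m = 5$ the count below comes out to exactly $3125\cdot Q(5)/5 = 1569$.

```python
from itertools import product

m = 5
count = 0
for f in product(range(m), repeat=m):   # all m^m = 3125 functions on {0,...,4}
    x, seen, i = 0, {}, 0
    while x not in seen:                # follow the rho path from X_0 = 0
        seen[x] = i
        x, i = f[x], i + 1
    lam = i - seen[x]                   # cycle length lambda
    count += (lam == 1)
assert count == 1569                    # = 3125 * Q(5)/5
```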

%\ansno 12. This answer begins with an aligned equation.
\penalty-200\vskip 12pt plus 6pt minus 6pt
\halign{\hbox to 19pt{\hfill#}⊗\quad\hfill$\dispstyle#$⊗$\dispstyle\;#$\hfill\cr
{\bf 12.}⊗\sum↓{\scriptstyle 1≤λ≤m \atop \scriptstyle 0≤\mu<m}λP(\mu,λ)⊗
={1\over m}\left(1+3\left(1-{1\over m}\right)+6\left(1-{1\over m}\right)
\left(1-{2\over m}\right)+\cdots\right)\cr
⊗⊗={1+Q(m) \over 2}.\cr}
\vskip 9 pt plus 6pt minus 6pt
\noindent
(See the previous answer. In general if $f(a↓0, a↓1,\ldotss) = \sum↓{n≥0}a↓n
\prod ↓{1≤k≤n}(1 - k/m)$ then $f(a↓0,
a↓1,\ldotss) = a↓0 + f(a↓1, a↓2,\ldotss) - f(a↓1, 2a↓2,\ldotss)/m$;
apply this identity with $a↓n = (n + 1)/2$.) Therefore the average
value of $λ$ (and, by symmetry of $P(\mu , λ)$, also of $\mu + 1$)
is approximately $\sqrt{πm/8}+ {1\over 3}$. The average value
of $\mu + λ$ is exactly $Q(m)$, approximately $\sqrt{πm/2} - {1\over 3}$.\xskip
[For alternative derivations and further results, including asymptotic
values for the moments, see A. Rapoport, {\sl Bull.\ Math.\ Biophysics
\bf 10} (1948), 145--157, and B. Harris, {\sl Annals Math.\ Stat.\ 
\bf 31} (1960), 1045--1062; see also I. M. Sobol, {\sl Theory of
Probability and its Applications \bf 9} (1964), 333--338. Sobol
discusses the asymptotic period length for the more general
sequence $X↓{n+1} = f(X↓n)$ if $n \neqv 0 \modulo{m};\ X↓{n+1}
= g(X↓n)$ if $n ≡ 0 \modulo{m}$; with both $f$ and $g$ random.]

\ansno 13. [Paul Purdom and John Williams, {\sl Trans.\ Amer.\ Math.\ 
Soc.\ \bf 133} (1968), 547--551.]\xskip Let $T↓{mn}$ be the number
of functions that have $n$ one-cycles and no cycles of length
greater than one. Then$$
T↓{mn}={m-1\choose n-1}m↑{m-n}.$$
(See Section 2.3.4.4.) {\sl Any} function is such a function followed by a
permutation of the $n$ elements that were the one-cycles. Hence $\sum↓{n≥1}
T↓{mn}\,n! = m↑m$.

Let $P↓{nk}$ be the number of permutations of $n$ elements in which the
longest cycle is of length $k$. Then the number of functions with a maximum
cycle of length $k$ is $\sum↓{n≥1}T↓{mn}P↓{nk}$. To get the average value of
$k$, we compute $\sum↓{k≥1}\sum↓{n≥1}kT↓{mn}P↓{nk}$, which by the result of
exercise 1.3.3-23 is $\sum↓{n≥1}T↓{mn}\,n!(n+{1\over 2}+ε↓n)c$
where $c \approx .62433$ and
$ε↓n→0$ as $n→∞$. Summing, we get the average value $cQ(m)+{1\over 2}c+\delta↓m$,
where
$\delta↓m→0$ as $m→∞$.\xskip (This is not substantially larger than the average
value when $X↓0$ is selected at random. The average value of $\max(\mu+λ)$ is
still unknown.)
%folio 654 galley 2 (C) Addison-Wesley 1978	*
\ansno 14.  Let $c↓r(m)$ be the number of
functions with exactly $r$ different final cycles. From the
recurrence $c↓1(m) = (m - 1)! - \sum ↓{k>0}{m\choose k}(-1)↑k(m
- k)↑kc↓1(m - k)$, which comes by counting the number of functions
whose image contains at most $m - k$ elements, we find the solution
$c↓1(m) = m↑{m-1}Q(m)$.\xskip (Cf.\ exercise 1.2.11.3--16.)\xskip Another
way to obtain the value of $c↓1(m)$, which is perhaps more elegant
and revealing, is given in exercise 2.3.4.4--17. The value of
$c↓r(m)$ may be determined by solving the recurrence
$$c↓r(m) = \sum ↓{0<k<m}{m-1\choose k-1}c↓1(k)c↓{r-1}(m-k),$$
which has the solution
$$c↓r(m) = m↑{m-1}\left({1\over 0!}{1\comb[]r}
+ {1\over 1!}{2\comb[]r}{m - 1\over m} + {1\over
2!}{3\comb[]r} {m - 1\over m} {m - 2\over m} + \cdots\right).$$
The desired average value can now be computed;
it is (cf.\ exercise 12)
$$\eqalign{E↓m⊗= {1\over m} \left(H↓1 + 2H↓2 {m
- 1\over m} + 3H↓3 {m - 1\over m} {m - 2\over m} + \cdots\right)\cr
⊗= 1 + {1\over 2} {m - 1\over m} + {1\over 3} {m - 1\over m}
{m - 2\over m} + \cdotss.\cr}$$
This latter formula was obtained by quite different
means by Martin D. Kruskal, {\sl AMM \bf 61} (1954), 392--397.
Using the integral representation
$$E↓m = \int ↑{∞}↓{0}\left(\left(
1 + {x\over m}\right)↑m - 1\right) e↑{-x}\, {dx\over x},$$
he proved the asymptotic relation
$\lim↓{m→∞} (E↓m - {1\over 2}\ln m) = {1\over 2}(\gamma
+\ln 2)$.
For further results and references, see John Riordan,
{\sl Annals Math.\ Stat.\ \bf 33} (1962), 178--185.

\ansno 15.  The probability that $f(x) ≠ x$ for all $x$
is $(m - 1)↑m/m↑m$, which is approximately $1/e$. The existence
of a self-repeating value in an algorithm like Algorithm K is
therefore not ``colossal'' at all---it occurs with probability
$1 - 1/e \approx .63212$. The only ``colossal'' thing was that
the author happened to hit such a value when $X↓0$ was chosen
at random (see exercise 11).

\ansno 16.  The sequence will repeat when a pair of successive
elements occurs for the second time. The maximum period is $m↑2$.\xskip
(Cf.\ next exercise.)

\ansno 17.  After selecting $X↓0, \ldotss , X↓{k-1}$ arbitrarily,
let $X↓{n+1} = f(X↓n, \ldotss , X↓{n-k+1})$, where $0 ≤ x↓1,
\ldotss , x↓k < m$ implies that $0 ≤f(x↓1, \ldotss , x↓k) < m$.
The maximum period is $m↑k$. This is an obvious upper bound,
but it is not obvious that it can be attained; for a proof that
it can always be attained for suitable $f$, see exercises 3.2.2--17
and 3.2.2--21, and for the number of ways to attain it see exercise
2.3.4.2--23.

\ansno 18. Same as exercise 7, but use the $k$-tuple of elements
$(X↓n,\ldotss,X↓{n-k+1})$
in place of $X↓n$.

\ansbegin{3.2.1}

\ansno 1. Take $X↓0$ even, $a$
even, $c$ odd. Then $X↓n$ is odd for $n > 0$.

\ansno 2.  Let $X↓r$ be the first repeated value in the
sequence. If $X↓r = X↓k$ for $0 < k < r$, we could prove that
$X↓{r-1} = X↓{k-1}$, since $X↓n$ uniquely determines $X↓{n-1}$
when $a$ is prime to $m$. Hence, $k = 0$.

\ansno 3. If $d$ is the greatest common
divisor of $a$ and $m$, the quantity $aX↓n$ can take on at most
$m/d$ values. The situation can be even worse; e.g., if $m =
2↑e$ and if $a$ is even, Eq.\ (6) shows that the sequence is
eventually constant.

\ansno 4.  Induction on $k$.

\ansno 5.  If $a$ is relatively prime to $m$, there is
a number $a↑\prime$ for which $aa↑\prime ≡ 1 \modulo{m}$. Then
$X↓{n-1} = (a↑\prime X↓n - a↑\prime c)\mod m$, and in general,
$$\eqalign{X↓{n-k}⊗=\left( (a↑\prime )↑kX↓n - c(a↑\prime + \cdots
+ (a↑\prime )↑k)\right)\mod m\cr ⊗= \left((a↑\prime
)↑kX↓n - c {(a↑\prime )↑{k+1} - a↑\prime \over a↑\prime - 1}\right)\mod
m\cr}$$
when $k > 0,\ n - k ≥ 0$. If $a$ is not relatively prime
to $m$, it is not possible to determine $X↓{n-1}$ when $X↓n$
is given; multiples of $m/\gcd(a, m)$ may be added to
$X↓{n-1}$ without changing the value of $X↓n$.\xskip (See also exercise
3.2.1.3--7.)
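A sketch of this backward stepping, with made-up parameters ($a$ odd, hence relatively prime to $m=2↑{20}$):

```python
m, a, c = 2**20, 12345, 67891    # made-up parameters; a is odd, so gcd(a, m) = 1
a_inv = pow(a, -1, m)            # a' with a a' = 1 (mod m); Python 3.8+

x = 42
for _ in range(10):              # ten steps forward ...
    x = (a * x + c) % m
for _ in range(10):              # ... and ten back: X_{n-1} = a'(X_n - c) mod m
    x = (a_inv * (x - c)) % m
assert x == 42                   # recovered the seed exactly
```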

\ansbegin{3.2.1.1}

\ansno 1. Let $c↑\prime$ be a
solution to the congruence $ac↑\prime ≡ c \modulo{m}$.\xskip (Thus,
$c↑\prime = a↑\prime c\mod m$, if $a↑\prime$ is the number in
the answer to exercise 3.2.1--5.)\xskip Then we have
$$\vcenter{\mixtwo{LDA⊗X\cr
ADD⊗CPRIME\cr
MUL⊗A⊗\blackslug\cr}}$$
Overflow is possible on this addition operation.\xskip (From results derived later
in the chapter, it is probably best to take $c=a$ and to replace the {\:t ADD}
instruction by ``{\:t INCA 1}''.  Then if $X↓0=0$, overflow will not occur
until the end of the period, so it won't occur in practice.)

\mixans 2. {⊗RANDM⊗STJ⊗1F\cr
⊗⊗LDA⊗XRAND\cr
⊗⊗MUL⊗A\cr
⊗⊗SLAX⊗5\cr
⊗⊗ADD⊗C⊗(or, \.{INCA} $c$, if $c$ is small)\cr
⊗⊗STA⊗XRAND\cr
⊗1H⊗JNOV⊗*\cr
⊗⊗JMP⊗*-1\cr
⊗XRAND⊗CON⊗X\cr
⊗A⊗CON⊗$a$\cr
⊗C⊗CON⊗$c$⊗\blackslug\cr}
\yyskip
\noindent{\sl Note:} Locations \.A and \.C should probably be named
\.{2H} and \.{3H} to avoid conflict with other symbols, if this subroutine
is to be used by other programmers.

\ansno 3.  See \.{WM1} at the end of Program 4.2.3D.

\ansno 4.  \def\\{\<\,\mathbin{\underline{\char'155\char'157\char'144}}\<\,}
Define the operation $x\\2↑e = y$ iff
$x ≡ y\modulo{2↑e}$ and $-2↑{e-1} ≤ y < 2↑{e-1}$. The congruential
sequence $\langle Y↓n\rangle$ defined by
$$Y↓0 = X↓0\\2↑{32},\qquad Y↓{n+1} = (aY↓n + c)\\2↑{32}$$
is easy to compute on 370-style machines, since
the lower half of the product of $y$ and $z$ is $(yz)\\2↑{32}$
for all two's complement numbers $y$ and $z$, and since addition
ignoring overflow also delivers its result $\!\null\\ 2↑{32}$. This
sequence has all the randomness properties of the standard linear
congruential sequence $\langle X↓n\rangle $, since $Y↓n ≡ X↓n\modulo
{2↑{32}}$. Indeed, the two's complement representation of $Y↓n$
is {\sl identical} to the binary representation of $X↓n$, for
all $n$.\xskip [G. Marsaglia and T. A. Bray first pointed this
out in {\sl CACM \bf 11} (1968), 757--759.]
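The identity between the two sequences is easy to demonstrate by simulating two's complement arithmetic; the multiplier and increment below are merely illustrative choices.

```python
MASK = (1 << 32) - 1

def smod(v):
    """v reduced mod 2^32 into the signed range [-2^31, 2^31): what a
    two's complement machine delivers when overflow is ignored."""
    r = v & MASK
    return r - (1 << 32) if r >= (1 << 31) else r

a, c = 69069, 1                  # illustrative multiplier and increment
x = 12345                        # unsigned X_n
y = smod(x)                      # signed Y_n with the same bit pattern
for _ in range(1000):
    x = (a * x + c) & MASK       # X_{n+1} = (a X_n + c) mod 2^32
    y = smod(a * y + c)          # Y_{n+1} = (a Y_n + c) "mod" 2^32, signed
    assert (y & MASK) == x       # identical 32-bit representations
```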

\ansno 5.  (a) Subtraction: \.{LDA X;} \.{SUB Y;} \.{JANN *+2;}
\.{ADD M}.\xskip (b) Addition: \.{LDA X;} \.{SUB M;} \.{ADD Y;} \.{JANN
*+2;} \.{ADD M}.\xskip (Note that if $m$ is near the word size, the instruction
``\.{SUB M}'' must precede the instruction ``\.{ADD Y}''.)

\ansno 6.  The sequences are not essentially different,
since adding the constant $(m - c)$ has the same effect as subtracting
the constant $c$. The operation must be combined with
multiplication, so a subtractive process has little merit over
the additive one (at least in \MIX's case), except when it is
necessary to avoid affecting the overflow toggle.

\ansno 7.  The prime factors of $z↑k - 1$ appear in the
factorization of $z↑{kr} - 1$. If $r$ is odd, the prime factors
of $z↑k + 1$ appear in the factorization of $z↑{kr} + 1$. And
$z↑{2k} - 1$ equals
$(z↑k - 1)(z↑k + 1)$.

%\ansno 8. appears within the following alignment.
\anskip
\halign{\hbox to 19pt{#}⊗\lft{\tt#}\quad⊗\lft{\tt#}\quad⊗
\lft{\rm#}\cr
{\hfill\bf 8. }⊗JOV⊗*+1
⊗(Ensure that overflow is off.)\cr
⊗LDA⊗X\cr
⊗MUL⊗A\cr
⊗STX⊗TEMP\cr
⊗ADD⊗TEMP⊗Add lower half to upper half.\cr
⊗JNOV⊗*+2⊗If $≥w$, subtract $w - 1$.\cr
⊗INCA⊗1⊗(Overflow is impossible in this step.)\quad\blackslug\cr}
\yyskip
\noindent{\sl Note:} Since addition on an $e$-bit
ones'-complement computer is $\mod\;(2↑e - 1)$, it is possible
to combine the techniques of exercises 4 and 8, producing $(yz)\mod(2↑e
- 1)$ by adding together the two $e$-bit halves of the product
$yz$, for all ones' complement numbers $y$ and $z$ regardless
of sign.
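A sketch of the halves-addition trick in Python (here with $e = 16$):

```python
e = 16
w = 1 << e                                   # word size 2^e
for y, z in [(1, w - 1), (w - 1, w - 1), (40503, 52525), (2, w - 1)]:
    p = y * z
    s = (p >> e) + (p & (w - 1))             # add the two e-bit halves
    if s >= w:                               # "overflow": fold once more (INCA 1)
        s = s - w + 1
    assert s % (w - 1) == p % (w - 1)        # s = yz (mod 2^e - 1)
```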

\ansno 9.  The pairs $(0, w - 2)$, $(1, w - 1)$ are treated
as equivalent in input and output:

\yyskip
\halign{\hbox to 19pt{#}⊗\lft{\tt#}\quad⊗\lft{\tt#}\quad⊗
\lft{\rm#}\cr
⊗JOV⊗*+1\cr
⊗LDA⊗X\cr
⊗MUL⊗A⊗$aX = qw + r$\cr
⊗SLC⊗5⊗$\rA ← r,\ \rX ← q$.\cr
⊗STX⊗TEMP\cr
⊗ADD⊗TEMP\cr
⊗JNOV⊗*+2⊗Get $(r + q)\mod(w - 2)$.\cr
⊗INCA⊗2⊗Overflow is impossible.\cr
⊗ADD⊗TEMP\cr
⊗JNOV⊗*+3⊗Get $(r + 2q)\mod(w - 2)$.\cr
⊗INCA⊗2⊗Overflow is possible.\cr
⊗JOV⊗*-1⊗\hskip4.5cm\blackslug\cr}
%folio 659 galley 3 (C) Addison-Wesley 1978	*
\ansbegin{3.2.1.2}

\ansno 1. Period length $m$,
by Theorem A\null.\xskip (Cf.\ exercise 3.)

\ansno 2.  Yes, these conditions imply the conditions
in Theorem A\null, since the only prime divisor of $2↑e$ is 2, and
any odd number is relatively prime to $2↑e$.\xskip (In fact, the conditions
of the exercise are {\sl necessary} and sufficient.)

\ansno 3.  By Theorem A\null, we need $a ≡ 1 \modulo{4}$ and
$a ≡ 1\modulo{5}$. By Law D of Section 1.2.4, this is equivalent
to $a ≡ 1\modulo{20}$.

\ansno 4.  We know $X↓{2↑{e-1}} ≡ 0
\modulo{2↑{e-1}}$ by using Theorem A in the case $m = 2↑{e-1}$.
Also using Theorem A for $m = 2↑e$, we know that $X↓{2↑{e-1}}
\neqv 0\modulo{2↑e}$. It follows
that $X↓{2↑{e-1}} = 2↑{e-1}$. More generally,
we can use Eq. 3.2.1--6 to prove that the second half of the
period is essentially like the first half, since $X↓{n+2↑{e-1}}
= (X↓n + 2↑{e-1})\mod 2↑e$.\xskip (The quarters
are similar too; see exercise 21.)

\ansno 5.  We need $a ≡ 1 \modulo{p}$ for $p = 3, 11,
43, 281, 86171$. By Law D of Section 1.2.4, this is equivalent
to $a ≡ 1 \modulo{3 \cdot 11 \cdot 43 \cdot 281 \cdot 86171}$, so the {\sl
only} solution is the terrible multiplier $a = 1$.

\ansno 6.  (Cf.\ previous exercise.) $a ≡ 1 \modulo {3\cdot
7 \cdot 11 \cdot 13 \cdot 37}$ implies that the solutions are $a =
1 + 111111k$, for $0 ≤ k ≤ 8$.

\ansno 7.  Using the notation of the proof of Lemma Q\null,
$\mu$ is the smallest value such that $X↓{\mu+λ} = X↓\mu$;
so it is the smallest value such that $Y↓{\mu+λ} = Y↓\mu$
and $Z↓{\mu+λ} = Z↓\mu$. This shows that $\mu = \max(\mu
↓1, \ldotss , \mu ↓t)$. The highest achievable $\mu$ is $\max(e↓1,
\ldotss , e↓t)$, but nobody really wants to achieve it.

\ansno 8.  $a↑2 ≡ 1 \modulo{8}$; so $a↑4 ≡ 1 \modulo
{16}$, $a↑8 ≡ 1 \modulo{32}$, etc. If $a \mod 4 = 3$, then $a - 1$ is
twice an odd number; thus, we have $(a↑{2↑{e-1}} - 1)/2 ≡ 0 \modulo
{2↑{e+1}/2}$, and this yields the desired result.

\ansno 9.  Substitute for $X↓n$ in terms of $Y↓n$ and
simplify. If $X↓0$ mod 4 = 3, the formulas of the exercise do
not apply; but they do apply to the sequence $Z↓n = (-X↓n)\mod
2↑e$, which has essentially the same behavior.

\ansno 10.  Only $m = 1$, 2, 4, $p↑e$, and $2p↑e$, for odd
primes $p$. In all other cases, the result of Theorem B is an
improvement over Euler's theorem (exercise 1.2.4--28).

\ansno 11.  (a) Either $x + 1$ or $x - 1$ (not both) will
be a multiple of 4, so $x \mp 1 = q2↑f$, where $q$ is odd and
$f$ is greater than 1.\xskip (b) In the given circumstances, $f <
e$ and so $e ≥ 3$. We have $\pm x ≡ 1 \modulo{2↑f}$ and $\pm
x \neqv 1 \modulo{2↑{f+1}}$ and $f > 1$. Hence by applying
Lemma P\null, we find that $(\pm x)↑{2↑{e-f-1}} \neqv 1 \modulo
{2↑e}$, while $x↑{2↑{e-f}} = (\pm x)↑{2↑{e-f}} ≡ 1 \modulo {2↑e}$.
So the order is a divisor of $2↑{e-f}$, but not a divisor of
$2↑{e-f-1}$.\xskip (c) 1 has order 1; $2↑e - 1$ has order 2; the maximum
period when $e ≥ 3$ is therefore $2↑{e-2}$, and for $e≥4$ it is necessary to
have $f = 2$, that is, $x ≡ 4 \pm 1 \modulo {8}$.

\ansno 12.  If $k$ is a proper divisor of $p - 1$ and
if $a↑k ≡ 1 \modulo{p}$, then by Lemma P $a↑{kp↑{e-1}}
≡ 1 \modulo {p↑e}$. Similarly, if $a↑{p-1}
≡ 1 \modulo {p↑2}$, we find that $a↑{(p-1)p↑{e-2}}
≡ 1 \modulo {p↑e}$. So in these cases $a$ is {\sl
not} primitive. Conversely, if $a↑{p-1} \neqv 1 \modulo
{p↑2}$, Theorem 1.2.4F and Lemma P tell us that $a↑{(p-1)p↑{e-2}}
\neqv 1 \modulo {p↑e}$, but $a↑{(p-1)p↑{e-1}}
≡ 1 \modulo {p↑e}$. So the order is a divisor
of $(p - 1)p↑{e-1}$ but not of $(p - 1)p↑{e-2}$; it therefore
has the form $kp↑{e-1}$, where $k$ divides $p - 1$. But if $a$
is primitive modulo $p$, the congruence $a↑{kp↑{e-1}}
≡ a↑k ≡ 1 \modulo {p}$ implies that $k = p - 1$.

\ansno 13.  Let $λ$ be the order of $a$ modulo $p$. By
Theorem 1.2.4F\null, $λ$ is a divisor of $p - 1$. If $λ < p - 1$, then
$(p - 1)/λ$ has a prime factor, $q$.

\ansno 14.  Let $0 < k < p$. If $a↑{p-1} ≡ 1 \modulo
{p↑2}$, then $(a + kp)↑{p-1} ≡ a↑{p-1} + (p - 1)a↑{p-2}kp \modulo
{p↑2}$; and this is $\neqv 1$, since $(p - 1)a↑{p-2}k$ is
not a multiple of $p$. By exercise 12, $a + kp$ is primitive
modulo $p↑e$.

\ansno 15. (a)\9 If $λ↓1 = p↑{e↓1}↓{1} \ldotss p↑{e↓t}↓{t},
λ↓2 = p↑{f↓1}↓{1} \ldotss p↑{f↓t}↓{t}$, let $\kappa ↓1 = p↑{g↓1}↓{1}
\ldotss p↑{g↓t}↓{t}, \kappa ↓2 = p↑{h↓1}↓{1} \ldotss p↑{h↓t}↓{t}$,
where$$\vbox{
\halign{$#$⊗\lft{$\;#$}⊗\quad#\quad⊗$#$⊗\lft{$\;#$}⊗\qquad if\quad$#$\cr
g↓j⊗=e↓j⊗and⊗h↓j⊗=0,⊗e↓j<f↓j,\cr
g↓j⊗=0⊗and⊗h↓j⊗=f↓j,⊗e↓j≥f↓j.\cr}}$$
Now $a↑{\kappa ↓1}↓{1}, a↑{\kappa ↓2}↓{2}$ have periods
$λ↓1/\kappa ↓1, λ↓2/\kappa ↓2$, and the latter are relatively
prime. Furthermore $(λ↓1/\kappa ↓1)(λ↓2/\kappa ↓2) = λ$, so it suffices
to consider the case when $λ↓1$ is relatively prime to $λ↓2$,
that is, when $λ = λ↓1λ↓2$. Now since $(a↓1a↓2)↑λ≡1$, we
have $1 ≡ (a↓1a↓2)↑{λλ↓1} ≡ a↑{λλ↓1}↓{2}$; hence $λλ↓1$
is a multiple of $λ↓2$. This implies that $λ$ is a multiple
of $λ↓2$, since $λ↓1$ is relatively prime to $λ↓2$. Similarly,
$λ$ is a multiple of $λ↓1$; hence $λ$ is a multiple of $λ↓1λ↓2$.
But obviously $(a↓1a↓2)↑{λ↓1λ↓2} ≡ 1$, so $λ
= λ↓1λ↓2$.

(b)\9 If $a↓1$ has order $λ(m)$ and if $a↓2$
has order $λ$, by part (a) $λ(m)$ must be a multiple of $λ$,
otherwise we could find an element of higher order, namely
of order lcm$\left(λ, λ(m)\right)$.

\ansno 16. (a)\9 $f(x) = (x
- a)\biglp x↑{n-1} + (a + c↓1)x↑{n-2} + \cdots + (a↑{n-1}
+ \cdots + c↓{n-1})\bigrp + f(a)$.\xskip (b) The statement is clear
when $n = 0$. If $a$ is one root,
$f(x) ≡ (x - a)q(x)$;
therefore if $a↑\prime$ is any other root,
$$0 ≡ f(a↑\prime ) ≡ (a↑\prime - a)q(a↑\prime ),$$
and since $a↑\prime - a$ is not a multiple of $p$,
$a↑\prime$ must be a root of $q(x)$. So if $f(x)$ has more than
$n$ distinct roots, $q(x)$ has more than $n - 1$ distinct roots.\xskip
(c) $λ(p) ≥ p - 1$, since $f(x)$ must have degree $≥p - 1$ in
order to possess so many roots. But by Theorem 1.2.4F\null, $λ(p)
≤ p - 1$.

\ansno 17.  By Lemma P\null, $11↑5 ≡ 1 \modulo{25}$, $11↑5 \neqv
1 \modulo{125}$, etc.; so the order of 11 is $5↑{e-1} \modulo
{5↑e}$, not the maximum value $λ(5↑e) = 4 \cdot 5↑{e-1}$. But by
Lemma Q the total period length is the least common multiple
of the period modulo $2↑e$ (namely $2↑{e-2}$) and the period
modulo $5↑e$ (namely $5↑{e-1}$), and this is $2↑{e-2}5↑{e-1}
= λ(10↑e)$. The period modulo $5↑e$ may be $5↑{e-1}$ or $2\cdot 5↑{e-1}$
or $4\cdot 5↑{e-1}$, without affecting the length of period modulo
$10↑e$, since the least common multiple is taken. The values
that are primitive modulo $5↑e$ are those congruent to 2, 3,
8, 12, 13, 17, 22, 23 modulo 25 (cf.\ exercise 12), namely 3,
13, 27, 37, 53, 67, 77, 83, 117, 123, 133, 147, 163, 173, 187,
197.
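The list of primitive elements modulo 25 can be confirmed by direct order computation:

```python
def order(a, m):
    # multiplicative order of a modulo m (a assumed relatively prime to m)
    k, x = 1, a % m
    while x != 1:
        x = x * a % m
        k += 1
    return k

# primitive elements modulo 25 are those of maximal order lambda(5^2) = 20
prim = [a for a in range(1, 25) if a % 5 != 0 and order(a, 25) == 20]
assert prim == [2, 3, 8, 12, 13, 17, 22, 23]
```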

\ansno 18.  According to Theorem C\null, $a \mod 8$ must
be 3 or 5. Knowing the period of $a$ modulo 5 and modulo 25
allows us to apply Lemma P to determine admissible values of
$a$ mod 25. Period $= 4 \cdot 5↑{e-1}$: 2, 3, 8, 12, 13, 17, 22,
23; period $= 2 \cdot 5↑{e-1}$: 4, 9, 14, 19; period $= 5↑{e-1}$:
6, 11, 16, 21. Each of these 16 values yields one value of
$a$, $0 ≤ a < 200$, with $a \mod 8 = 3$, and another value of $a$
with $a \mod 8 = 5$.

\ansno 19. One example appears in Table 3.3.4--1, line 26.

\ansno 20.  (a) We have $AY↓n + X↓0 ≡ AY↓{n+k} + X↓0 \modulo
{m}$ iff $Y↓n ≡ Y↓{n+k} \modulo {m↑\prime}$.\xskip (b)(i) Obvious.\xskip
(ii) Theorem A.\xskip (iii) $(a↑n - 1)/(a - 1) ≡ 0 \modulo {2↑e}$
iff $a↑n ≡ 1 \modulo {2↑{e+1}}$; and if $a \neqv -1$,
the order of $a$ modulo $2↑{e+1}$ is twice its order modulo
$2↑e$.\xskip (iv) $(a↑n - 1)/(a - 1) ≡ 0 \modulo {p↑e}$ iff $a↑n
≡ 1$.

\ansno 21. $X↓{n+s} ≡ X↓n+X↓s$ by Eq. 3.2.1--6; and $s$ is a divisor of
$m$, since $s$ is a power of $p$ when $m$ is a power of $p$. Hence
a given integer $q$ is a multiple of $m/s$ iff $X↓{qs} ≡ 0$ iff $q$ is
a multiple of $m/\gcd(X↓s,m)$.

\ansbegin{3.2.1.3}

\ansno 1. $c = 1$ is always relatively
prime to $B↑5$; and every prime dividing $m = B↑5$ is a divisor
of $B$, so it divides $b = B↑2$ to at least the second power.

\ansno 2.  3, so the generator is not recommended in spite
of its long period.

\ansno 3.  The potency is 18 in both cases (see next exercise).

\ansno 4.  Since $a \mod 4 = 1$, we must have $a \mod 8
= 1$ or 5, so $b \mod 8 = 0$ or 4. If $b$ is an odd multiple of
4, and if $b↓1$ is a multiple of 8, clearly $b↑s ≡ 0 \modulo
{2↑e}$ implies that $b↑{s}↓{1} ≡ 0 \modulo {2↑e}$, so $b↓1$
cannot have higher potency.

\ansno 5.  The potency is the smallest value of $s$ such
that $f↓js ≥ e↓j$ for all $j$.

\ansno 6.  $m$ must be divisible by $2↑7$ or by $p↑4$ (for
odd prime $p$) in order to have a potency as high as 4. The
only values are $m = 2↑{27} + 1$ and $10↑9 - 1$.

\ansno 7.  $a↑\prime = (1 - b + b↑2 - \cdots)\mod m$,
where the terms in $b↑s, b↑{s+1}$, etc., are dropped (if $s$
is the potency).

\ansno 8.  Since $X↓n$ is always odd,
$$X↓{n+2} = (2↑{34} + 3 \cdot 2↑{18} + 9)X↓n \mod 2↑{35} = (2↑{34}
+ 6X↓{n+1} - 9X↓n)\mod 2↑{35}.$$
Given $Y↓n$ and $Y↓{n+1}$, the possibilities for
$$Y↓{n+2} \approx \left( 5 + 6(Y↓{n+1} + \epsilon ↓1) - 9(Y↓n
+ \epsilon ↓2)\right)\mod 10,$$
with $0 ≤ \epsilon ↓1 < 1,\ 0 ≤ \epsilon ↓2 < 1$,
are limited and nonrandom.
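The dependence among successive $X$'s is easy to exhibit; a sketch with an arbitrary odd seed:

```python
a, m = 2**17 + 3, 2**35          # the multiplier and modulus discussed above (c = 0)
x = [123456789]                  # an arbitrary odd seed, so every X_n stays odd
for _ in range(52):
    x.append(a * x[-1] % m)      # X_{n+1} = a X_n mod 2^35
for n in range(50):
    # since X_n is odd, 2^34 X_n = 2^34 (mod 2^35), giving the stated relation
    assert x[n + 2] == (2**34 + 6 * x[n + 1] - 9 * x[n]) % m
```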

{\sl Note:} If the multiplier suggested in exercise
3 were, say, $2↑{33} + 2↑{18} + 2↑2 + 1$, instead of $2↑{23} +
2↑{14} + 2↑2 + 1$, we would similarly find $X↓{n+2} - 10X↓{n+1}
+ 25X↓n ≡\hbox{constant}\modulo{2↑{35}}$. In general, we do not
want $a \pm \delta$ to be divisible by high powers of 2 when
$\delta$ is small, else we get ``second order impotency.'' See
Section 3.3.4 for a more detailed discussion.

The generator that appears in this exercise is
discussed in an article by MacLaren and Marsaglia, {\sl JACM \bf 12}
(1965), 83--89. The deficiencies of such generators were first
demonstrated by M. Greenberger, {\sl CACM \bf 8} (1965), 177--179.
Yet generators like this were still in widespread use
more than ten years later (cf.\ the discussion of \.{RANDU} in
Section 3.3.4).
%folio 662 galley 4 (C) Addison-Wesley 1978 	*
\ansbegin{3.2.2}

\ansno 1. The method is useful
only with great caution. In the first place, $aU↓n$ is likely
to be so large that the addition of $c/m$ that follows will
lose almost all significance, and the ``mod 1'' operation will
nearly destroy any vestiges of significance that might remain.
We conclude that double-precision floating-point arithmetic is necessary.
Even with double precision, one must be sure that no rounding, etc.,
occurs to affect the numbers of the sequence in any way, since
that would destroy the theoretical grounds for the good behavior
of the sequence.\xskip (But see exercise 23.)

\ansno 2.  $X↓{n+1}$ equals either $X↓{n-1} + X↓n$ or
$X↓{n-1} + X↓n - m$. If $X↓{n+1} < X↓n$ we must have $X↓{n+1}
= X↓{n-1} + X↓n - m;$ hence $X↓{n+1} < X↓{n-1}$.

\ansno 3. (a)\9 The underlined numbers are $V[j]$ after step M3.
\def\\#1{$\underline{#1}$}
\def\0{\hskip 4.5pt} 
\def\3{$\ldots$}
$$\vbox{\baselineskip0pt\lineskip0pt\halign{\tabskip 4.5pt
\rt{#}⊗\rt{#}⊗#⊗\lft{#}\tabskip0pt\cr
\noalign{\hrule}
Output:⊗initial⊗\vrule height 10.25pt depth 4pt⊗
0\04\05\06\02\00\03\hbox to 4.5pt{\hfill(}2\07\04\0
1\06\03\00\05)\qquad and repeats.\cr
\noalign{\hrule}
$V[0]$:⊗0⊗\vrule height 10.25pt depth 3pt⊗
\\4\0\\7\07\07\07\07\07\07\0\\4\0\\7\07\07\07\07\07\07\0\\4\0\\7\0\3\cr
$V[1]$:⊗3⊗\vrule height 9.25pt depth 3pt⊗
3\03\03\03\03\03\0\\2\0\\5\05\05\05\05\05\05\0\\2\0\\5\05\05\0\3\cr
$V[2]$:⊗2⊗\vrule height 9.25pt depth 3pt⊗
2\02\02\02\0\\0\0\\3\03\03\03\03\03\03\0\\0\0\\3\03\03\03\03\0\3\cr
$V[3]$:⊗5⊗\vrule height 9.25pt depth 4pt⊗
5\05\0\\6\0\\1\01\01\01\01\01\01\0\\6\0\\1\01\01\01\01\01\01\0\3\cr
\noalign{\hrule}
$X$:⊗⊗\vrule height 10.25pt depth 3pt⊗
4\07\06\01\00\03\02\05\04\07\06\01\00\03\02\05\04\07\0\3\cr
$Y$:⊗⊗\vrule height 9.25pt depth 4pt⊗
0\01\06\07\04\05\02\03\00\01\06\00\04\05\02\03\00\01\0\3\cr
\noalign{\hrule}}}$$
So the potency has been reduced to 1! (See further comments in the
answer to exercise 15.)

(b)\9 The underlined numbers are $V[j]$ after step B2.
$$\vbox{\baselineskip0pt\lineskip0pt\halign{
\tabskip 4.5pt\rt{#}⊗\rt{#}⊗#⊗\lft{#}\cr
\noalign{\hrule}
Output:⊗initial⊗\vrule height 10.25pt depth 4pt⊗
2\03\06\05\07\00\00\05\03\0\3\04\06\hbox to 4.5pt{\hfill(}3\0
0\0\3\04\07\hbox to 4.5pt{)\hfill}\3\cr
\noalign{\hrule}
$V[0]$:⊗0⊗\vrule height 10.25pt depth 3pt⊗
0\00\00\00\00\00\0\\5\0\\4\04\0\3\01\01\01\01\0\3\01\01\0\3\cr
$V[1]$:⊗3⊗\vrule height 9.25pt depth 3pt⊗
3\0\\6\0\\1\01\01\01\01\01\01\0\3\00\00\00\0\\4\0\3\00\00\0\3\cr
$V[2]$:⊗2⊗\vrule height 9.25pt depth 3pt⊗
\\7\07\07\07\0\\3\03\03\03\0\\7\0\3\06\0\\2\02\02\0\3\07\0\\2\0\3\cr
$V[3]$:⊗5⊗\vrule height 9.25pt depth 4pt⊗
5\05\05\0\\0\00\0\\2\02\02\02\0\3\0\\3\03\0\\5\05\0\3\0\\3\03\0\3\cr
\noalign{\hrule}
$X$:⊗4⊗\vrule height 10.25pt depth 4pt⊗
7\06\01\00\03\02\05\04\07\0\3\03\02\05\04\0\3\03\02\0\3\cr
\noalign{\hrule}}}$$
In this case the output is considerably better than the input; it enters a
repeating cycle of length 40 after 46 steps: 236570 05314 72632 40110 37564
76025 12541 73625 03746 (30175 24061 52317 46203 74531 60425 16753 02647).
The cycle can be found easily by applying the method of exercise 3.1-7 to the
above array until a column is repeated.
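The run tabulated above can be replayed mechanically. A sketch, reconstructed from the table: the source recurrence $X↓{n+1} = (5X↓n + 3) \mod 8$ with $X↓0 = 4$ is inferred from the tabulated $X$ row, and each step of Algorithm B sets $j ← \lfloor kY/m\rfloor$, outputs $V[j]$, and refills $V[j]$ with the next $X$.

```python
# Replay of the Algorithm B example: k = 4, m = 8, V = (0, 3, 2, 5), Y = 4.
k, m = 4, 8
V, Y, X = [0, 3, 2, 5], 4, 4
out = []
for _ in range(200):
    j = k * Y // m        # choose an entry of V from the last output
    Y = V[j]              # output it
    X = (5 * X + 3) % 8   # next number of the source sequence
    V[j] = X              # refill the table
    out.append(Y)
print(out[:9])                      # [2, 3, 6, 5, 7, 0, 0, 5, 3]
print(out[60:100] == out[100:140])  # True: cycle of length 40
```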

\ansno 4.  The low-order byte of many random sequences
(e.g., linear congruential sequences with $m = \hbox{word size}$) is
much less random than the high-order byte. See Section 3.2.1.1.

\ansno 5.  The randomizing effect would be quite limited,
because $V[j]$ would always contain a number in a certain
range, essentially $j/k ≤ V[j]/m < (j + 1)/k$. However, some
similar approaches could be used: we could take $Y↓n = X↓{n-1}$,
or we could choose $j$ from $X↓n$ by extracting some digits
from about the middle instead of at the extreme left. None of
these suggestions would produce a lengthening of the period analogous
to the behavior of Algorithm B.

\ansno 6.  For example, if $\.X↓n/m < {1\over 2}$, then
$\.X↓{n+1} = 2\.X↓n$.

\ansno 7.  [W. Mantel, {\sl Nieuw Archief voor Wiskunde}
(2) {\bf 1} (1897), 172--184.]$$
\def\\{\hskip 4.5pt}
\hbox{
\vbox{\halign{\rt{#}\cr
The subsequence of\cr
\.X values:\cr}}
\quad
$\vcenter{\halign{\ctr{#}\cr
00\301\cr 00\310\cr .\\.\\.\cr 10\300\cr \.{CONTENTS(A)}\cr}}$
\quad becomes:\quad
$\vcenter{\halign{\ctr{#}\cr
00\301\cr 00\310\cr .\\.\\.\cr 10\300\cr 00\300\cr \.{CONTENTS(A)}\cr}}$
}$$

\ansno 8. We may assume that $X↓0 = 0,\ m = p↑e$,
as in the proof of Theorem 3.2.1.2A\null. First suppose that the
sequence has period length $p↑e$; it follows that the period
of the sequence mod $p↑f$ has length $p↑f$, for $1 ≤ f ≤ e$, otherwise some residues
mod $p↑f$ would never occur. Clearly, $c$ is not a multiple
of $p$, for otherwise each $X↓n$ would be a multiple of $p$.
If $p ≤ 3$, it is easy to establish the necessity of conditions
(iii) and (iv) by trial and error, so we may assume that $p
≥ 5$. If $d \neqv 0 \modulo{p}$ then $dx↑2 + ax + c ≡ d(x
+ a↓1)↑2 + c↓1 \modulo {p↑e}$ for some integers $a↓1$ and $c↓1$
and for all $x$; this quadratic takes the same value at the
points $x$ and $-x - 2a↓1$, so it cannot assume all values modulo
$p↑e$. Hence $d ≡ 0 \modulo{p}$; and if $a \neqv 1 \modulo{p}$,
we would have $dx↑2 + ax + c ≡ x \modulo {p}$ for some $x$,
contradicting the fact that the sequence mod $p$ has period
length $p$.

To show the sufficiency of the conditions, we
may assume by Theorem 3.2.1.2A and consideration of some trivial
cases that $m = p↑e$ where $e ≥ 2$. If $p = 2$, we have $X↓{n+p}
≡ X↓n + pc \modulo {p↑2}$, by trial; and if $p = 3$, we have
either $X↓{n+p} ≡ X↓n + pc \modulo {p↑2}$, for all $n$, or
$X↓{n+p} ≡ X↓n - pc \modulo {p↑2}$, for all $n$. For $p ≥ 5$,
we can prove that $X↓{n+p} ≡ X↓n + pc \modulo {p↑2}$: Let $d
= pr$, $a = 1 + ps$. Then if $X↓n ≡ cn + pY↓n \modulo {p↑2}$,
we must have $Y↓{n+1} ≡ n↑2c↑2r + ncs + Y↓n \modulo {p}$; hence
$Y↓n ≡ {n\choose 3}2c↑2r + {n\choose 2}(c↑{2}r + cs) \modulo
{p}$. Thus $Y↓p \mod p = 0$, and the desired relation has been
proved.

Now we can prove that the sequence $\langle X↓n\rangle$
of integers defined in the ``hint'' satisfies the relation
$$X↓{n+p↑f} ≡ X↓n + tp↑f\quad \modulo {p↑{f+1}},\qquad
n ≥ 0,$$
for some $t$ with $t \mod p ≠ 0$, and for all
$f ≥ 1$. This suffices to prove that the sequence $\langle X↓n
\mod p↑e\rangle$ has period length $p↑e$, for the length of
the period is a divisor of $p↑e$ but not a divisor of $p↑{e-1}$.
The above relation has already been established for $f = 1$,
and for $f > 1$ it can be proved by induction in the following
manner: Let
$$X↓{n+p↑f} ≡ X↓n + tp↑f + Z↓np↑{f+1}\quad \modulo
{p↑{f+2}};$$
then the quadratic law for generating the sequence,
with $d = pr$, $a = 1 + ps$, yields
$Z↓{n+1} ≡ 2rtnc + st + Z↓n\modulo {p}$.
It follows that $Z↓{n+p} ≡ Z↓n \modulo {p}$; hence
$$X↓{n+kp↑f} ≡ X↓n + k(tp↑f + Z↓np↑{f+1})\quad \modulo
{p↑{f+2}}$$
for $k = 1, 2, 3, \ldotss$; setting $k = p$ completes
the proof.

{\sl Notes:} If $f(x)$ is a polynomial of degree
higher than 2 and $X↓{n+1} = f(X↓n)$, the analysis is more complicated,
although we can use the fact that $f(m + p↑k) = f(m) + p↑kf↑\prime
(m) + p↑{2k}f↑{\prime\prime} (m)/2! + \cdots$ to prove that many polynomial
recurrences give the maximum period. For example, Coveyou has
proved that the period is $m = 2↑e$ if $f(0)$ is odd, $f↑\prime
(j) ≡ 1$, $f↑{\prime\prime}(j) ≡ 0$, and $f(j + 1) ≡ f(j) + 1 \modulo
{4}$ for $j = 0, 1, 2, 3$.\xskip [{\sl Studies in Applied Math.\ \bf 3} (1967),
70--111.]

\ansno 9. Let $X↓n = 4Y↓n + 2$; then the
sequence $Y↓n$ satisfies the quadratic recurrence $Y↓{n+1} =
(4Y↑{2}↓{n} + 5Y↓n + 1)\mod 2↑{e-2}$.
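The substitution can be checked directly, assuming (as stated in the exercise) that the $X$ recurrence is $X↓{n+1} = X↓n(X↓n + 1) \mod 2↑e$; algebraically, $(4y+2)(4y+3) = 4(4y↑2+5y+1)+2$.

```python
# Check that X_n = 4 Y_n + 2 turns X_{n+1} = X_n (X_n + 1) mod 2^e
# into Y_{n+1} = (4 Y_n^2 + 5 Y_n + 1) mod 2^(e-2).
e = 10
x, y = 2, 0                       # X_0 = 2 corresponds to Y_0 = 0
ok = True
for _ in range(50):
    x = x * (x + 1) % 2**e
    y = (4*y*y + 5*y + 1) % 2**(e-2)
    ok = ok and x == 4*y + 2
print(ok)  # True
```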

\ansno 10.  {\sl Case 1:} $X↓0 = 0$, $X↓1 = 1$; hence $X↓n
≡ F↓n$. We seek the smallest $n$ for which $F↓n ≡ 0$ and $F↓{n+1}
≡ 1\modulo {2↑e}$. Since $F↓{2n} = F↓n(F↓{n-1} + F↓{n+1})$, $F↓{2n+1}
= F↑{2}↓{n} + F↑{2}↓{n+1}$, we find by induction on $e$ that,
for $e > 1$, $F↓{3\cdot2↑{e-1}}≡0$ and $F↓{3\cdot2↑{e-1}+1}≡2↑e+1
\modulo{2↑{e+1}}$.
This implies that the period is a divisor of $3 \cdot 2↑{e-1}$
but not a divisor of $3 \cdot 2↑{e-2}$, so it is either $3 \cdot 2↑{e-1}$
or $2↑{e-1}$. But $F↓{2↑{e-1}}$ is always
odd (since only $F↓{3n}$ is even).
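So the period mod $2↑e$ is $3\cdot2↑{e-1}$ for $e > 1$; this is easy to confirm by brute force (a sketch):

```python
# Period of the Fibonacci sequence mod 2^e: the pair (F_n, F_{n+1})
# returns to (0, 1) first at n = 3 * 2^(e-1), for e > 1.

def fib_period(m):
    a, b, n = 0, 1, 0
    while True:
        a, b, n = b, (a + b) % m, n + 1
        if (a, b) == (0, 1):
            return n

for e in range(2, 12):
    assert fib_period(2**e) == 3 * 2**(e-1)
print(fib_period(2**10))  # 1536
```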

{\sl Case 2:} $X↓0 = a$, $X↓1 = b$. Then
$X↓n ≡ aF↓{n-1} + bF↓n$; we need to find the smallest positive
$n$ with $a(F↓{n+1} - F↓n) + bF↓n ≡ a$ and $aF↓n + bF↓{n+1} ≡ b$.
This implies that $(b↑2 - ab - a↑2)F↓n ≡ 0$, $(b↑2 - ab - a↑2)(F↓{n+1}
- 1) ≡ 0$; and $b↑2 - ab - a↑2$ is odd (i.e., prime to $m$)
so the condition is equivalent to $F↓n ≡ 0$, $F↓{n+1} ≡ 1$.

Methods to determine the period of $F↓n$
for any modulus appear in an article by D. D. Wall, {\sl AMM
\bf 67} (1960), 525--532. Further facts about the Fibonacci
sequence mod $2↑e$ have been derived by B. Jansson [{\sl Random
Number Generators} (Stockholm: Almqvist and Wiksell, 1966),
Section 3C1].

\def\\#1{\penalty0\;\biglp\hbox{modulo }f(z)\hbox{ and }#1\bigrp}
\ansno 11. (a)\9 We have $z↑λ = 1 + f(z)u(z) + p↑ev(z)$
for some $u(z), v(z)$, where $v(z) \neqv 0\\{p}$.
By the binomial theorem
$$z↑{λp} = 1 + p↑{e+1}v(z) + p↑{2e+1}v(z)↑2(p - 1)/2$$
plus further terms congruent to zero $\\{p↑{e+2}}$.
Since $p↑e>2$, we have $z↑{λp}≡1+p↑{e+1}v(z)\\{p↑{e+2}}$.
If $p↑{e+1}v(z) ≡ 0 \\{p↑{e+2}}$,
there exist polynomials $a(z), b(z)$ such
that $p↑{e+1}\biglp v(z) + pa(z)\bigrp = f(z)b(z)$. Since $f(0)
= 1$, this implies that $b(z)$ is a multiple of $p↑{e+1}$ (by
Gauss's Lemma 4.6.1G); hence $v(z) ≡ 0\\{p}$,
a contradiction.

(b)\9 If $z↑λ - 1 = f(z)u(z) + p↑ev(z)$, then
$$G(z) = u(z)/(z↑λ - 1) + p↑ev(z)/f(z)(z↑λ - 1);$$
hence $A↓{n+λ} ≡ A↓n \modulo {p↑e}$ for large
$n$. Conversely, if $\langle A↓n\rangle$ has the latter property
then $G(z) = u(z) + v(z)/(1 - z↑λ) + p↑eH(z)$, for some polynomials
$u(z)$ and $v(z)$, and some power series $H(z)$, all with integer
coefficients. This implies the identity $1 - z↑λ = u(z)f(z)(1 - z↑λ)
+ v(z)f(z) + p↑eH(z)f(z)(1 - z↑λ)$; and $H(z)f(z)(1 - z↑λ)$
is a polynomial since the other terms of the equation are polynomials.

(c)\9 It suffices to prove that $λ(p↑e) ≠ λ(p↑{e+1})$
implies that $λ(p↑{e+1}) = pλ(p↑e) ≠ λ(p↑{e+2})$. Applying (a)
and (b), we know that $λ(p↑{e+2}) ≠ pλ(p↑e)$, and that $λ(p↑{e+1})$
is a divisor of $pλ(p↑e)$ but not of $λ(p↑e)$. Hence if $λ(p↑e)
= p↑fq$, where $q \mod p ≠ 0$, then $λ(p↑{e+1})$ must be $p↑{f+1}d$,
where $d$ is a divisor of $q$. But now $X↓{n+p↑{f+1}}
≡ X↓n \modulo {p↑e}$; hence $p↑{f+1}d$ is a multiple
of $p↑fq$, hence $d = q$.\xskip [{\sl Note:} The hypothesis $p↑e >2$
is necessary; for example, let $a↓1 = 4$, $a↓2 = -1$, $k = 2$; then
$\langle A↓n\rangle = 1$, 4, 15, 56, 209, 780, $\ldotss$; $λ(2)
= 2$, $λ(4) = 4$, $λ(8) = 4$.]
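The counterexample in this note is easy to verify by brute force; a sketch (the period-finding routine is illustrative):

```python
# With a_1 = 4, a_2 = -1 the sequence 1, 4, 15, 56, 209, 780, ... has
# λ(2) = 2 and λ(4) = 4 but λ(8) = 4, so the hypothesis p^e > 2 matters.

def period(m, n_terms=2000):
    a, b = 1 % m, 4 % m
    seq = [a, b]
    for _ in range(n_terms):
        a, b = b, (4*b - a) % m   # A_{n+1} = 4 A_n - A_{n-1}
        seq.append(b)
    for lam in range(1, n_terms // 2):
        if all(seq[i] == seq[i + lam] for i in range(n_terms // 2)):
            return lam
    return None

print(period(2), period(4), period(8))  # 2 4 4
```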

(d)\9 $g(z) = X↓0 + (X↓1 - a↓1X↓0)z + \cdots$\par
\rjustline{$+ (X↓{k-1} - a↓1X↓{k-2} - a↓2X↓{k-3} - \cdots
- a↓{k-1}X↓0)z↑{k-1}$.}

(e)\9 The derivation in (b) can be generalized
to the case $G(z) = g(z)/f(z)$; then the assumption of period
length $λ$ implies that $g(z)(1 - z↑λ) ≡ 0 \\{p↑e}$;
we treated only the special case $g(z)
= 1$ above. But both sides of this congruence can be multiplied
by Hensel's $b(z)$, and we obtain $1 - z↑λ ≡ 0 \\{p↑e}$.

{\sl Note:} A more ``elementary'' proof of the
result in (c) can be given without using generating functions,
using methods analogous to those in the answer to exercise 8:
If $A↓{λ+n} = A↓n + p↑eB↓n$, for $n = r, r + 1, \ldotss ,
r + k - 1$ and some integers $B↓n$, then this same relation
holds for {\sl all} $n ≥ r$ if we define $B↓{r+k}, B↓{r+k+1},
\ldots$ by the given recurrence relation. Since the resulting
sequence of $B$'s is some linear combination of shifts of the
sequence of $A$'s, we will have $B↓{λ+n} ≡ B↓n \modulo {p↑e}$
for all large enough values of $n$. Now $λ(p↑{e+1})$ must be
some multiple of $λ = λ(p↑e)$; for all large enough $n$ we have
$A↓{n+jλ} = A↓n + p↑e(B↓n + B↓{n+λ} + B↓{n+2λ} + \cdots
+ B↓{n+(j-1)λ}) ≡ A↓n + jp↑eB↓n \modulo {p↑{2e}}$ for
$j = 1, 2, 3, \ldots\,.$ No $k$ consecutive $B$'s are multiples
of $p$; hence $λ(p↑{e+1}) = pλ(p↑e) ≠ λ(p↑{e+2})$ follows immediately
when $e ≥ 2$. We still must prove that $λ(p↑{e+2}) ≠ pλ(p↑e)$
when $p$ is odd and $e = 1$; here we let $B↓{λ+n} = B↓n +
pC↓n$, and observe that $C↓{n+λ} ≡ C↓n \modulo {p}$ when
$n$ is large enough. Then $A↓{n+p} ≡ A↓n + p↑2\left( B↓n + {p\choose 2}
C↓n\right)\modulo {p↑3}$, and the proof is readily completed.

For the history of this problem, see Morgan Ward,
{\sl Trans. Amer. Math. Soc. \bf 35} (1933), 600--628; see
also D. W. Robinson, {\sl AMM \bf 73} (1966), 619--621.
%folio 666 galley 5 (C) Addison-Wesley 1978	*
\ansno 12. The period length mod 2 can be
at most 4; and the maximum period length mod $2↑{e+1}$ is at
most twice the maximum length mod $2↑e$, by the considerations
of the previous exercise. So the maximum conceivable period
length is $2↑{e+1}$; this is achievable, for example, in the
trivial case $a = 0$, $b = c = 1$.

\ansnos 13,14.  Clearly $Z↓{n+λ} = Z↓n$, so $λ↑\prime$
is certainly a divisor of $λ$. Let the least common multiple
of $λ↑\prime$ and $λ↓1$ be $λ↑{\prime}↓{1}$, and define $λ↑{\prime}↓{2}$
similarly. $X↓n + Y↓n ≡ Z↓n ≡ Z↓{n+λ↓1↑\prime} ≡ X↓n +
Y↓{n+λ↓1↑\prime}$, so $λ↓1↑\prime$ is a multiple of $λ↓2$.
Similarly, $λ↑{\prime}↓{2}$
is a multiple of $λ↓1$. This yields the desired result.\xskip (The
result is ``best possible'' in the sense that sequences for
which $λ↑\prime = λ↓0$ can be constructed, as well as sequences
for which $λ↑\prime = λ$.)

\ansno 15.  Algorithm M generates $(X↓{n+k}, Y↓n)$ in
step M1 and outputs $Z↓n = X↓{n+k-q↓n}$ in step M3, for
all sufficiently large $n$. Thus $\langle Z↓n\rangle$ has a period of length
$λ↑\prime $, where $λ↑\prime$ is the least positive integer such
that $X↓{n+k-q↓n} = X↓{n+λ↑\prime+k-q↓{n+λ↑\prime}}$
for all large $n$. Since $λ$ is a
multiple of $λ↓1$ and $λ↓2$, it follows that $λ↑\prime$ is a
divisor of $λ$.\xskip (These observations are due to Alan G. Waterman.)

We also have $n + k - q↓n ≡ n + λ↑\prime + k -
q↓{n+λ↑\prime}\modulo {λ↓1}$ for all large $n$, by the distinctness
of the $X$'s. The bound on $\langle q↓n\rangle$ implies that
$q↓{n+λ↑\prime} = q↓n + c$ for all large $n$, where $c ≡ λ↑\prime
\modulo {λ↓1}$ and $\leftv c\rightv < {1\over 2}λ↓1$. But $c$ must be
0 since $\langle q↓n\rangle$ is bounded. Hence $λ↑\prime ≡ 0
\modulo {λ↓1}$, and $q↓{n+λ↑\prime}= q↓n$ for all large $n$;
it follows that $λ↑\prime$ is a multiple of $λ↓2$ and $λ↓1$,
so $λ↑\prime = λ$.

{\sl Note:} The answer to exercise 3.2.1.2--4
implies that when $\langle Y↓n\rangle$ is a linear congruential
sequence of maximum period modulo $m = 2↑e$, the period length
$λ↓2$ will be at most $2↑{e-2}$ when $k$ is a power of 2.

\ansno 16.  There are several methods of proof.

(1)\9 Using the theory of finite fields.\xskip
In the field with $2↑k$ elements let $\xi$ satisfy $\xi ↑k =
a↓1\xi ↑{k-1} + \cdots + a↓k$. Let $f(b↓1\xi ↑{k-1} + \cdots
+ b↓k) = b↓k$, where each $b↓j$ is either zero or one; this
is a linear function. If word \.X in the generation algorithm
is $(b↓1b↓2 \ldotsm b↓k)↓2$ before (10) is executed, and if $b↓1\xi
↑{k-1} + \cdots + b↓k\xi ↑0 = \xi ↑n$, then word \.X represents
$\xi ↑{n+1}$ after (10) is executed. Hence the sequence is $f(\xi
↑n)$, $f(\xi ↑{n+1})$, $f(\xi ↑{n+2})$, $\ldotss$; and $f(\xi ↑{n+k})
= f(\xi ↑n\xi ↑k) = f(a↓1\xi ↑{n+k-1} + \cdots + a↓k\xi ↑n)
= a↓1f(\xi ↑{n+k-1}) + \cdots + a↓kf(\xi ↑n)$.

(2)\9 Using brute force, or elementary ingenuity.\xskip
We are given a sequence $X↓{nj}$, $n ≥ 0$, $1 ≤ j ≤ k$, satisfying
$$X↓{(n+1)j} ≡ X↓{n(j+1)} + a↓jX↓{n1},\qquad 1 ≤ j < k;\qquad
X↓{(n+1)k} ≡ a↓kX↓{n1}\quad \modulo 2.$$
We must show that this implies $X↓{nk} ≡ a↓1X↓{(n-1)k}
+ \cdots + a↓kX↓{(n-k)k}$, for $n ≥ k$. Indeed, it implies
$X↓{nj} ≡ a↓1X↓{(n-1)j} + \cdots + a↓kX↓{(n-k)j}$ when $1 ≤
j ≤ k ≤ n$. This is clear for $j = 1$, since $X↓{n1} ≡ a↓1X↓{(n-1)1}
+ X↓{(n-1)2} ≡ a↓1X↓{(n-1)1} + a↓2X↓{(n-2)2} + X↓{(n-2)3}$,
etc. For $j > 1$, we have by induction
$$\eqalign{X↓{nj} ⊗≡ X↓{(n+1)(j-1)} - a↓{j-1}X↓{n1}\cr
\noalign{\vskip 6pt}
⊗≡ \sum ↓{1
≤i≤k} a↓iX↓{(n+1-i)(j-1)} - a↓{j-1} \sum ↓{1≤i≤k} a↓iX↓{(n-i)1}\cr
⊗≡ \sum ↓{1≤i≤k} a↓i(X↓{(n+1-i)(j-1)} - a↓{j-1}X↓{(n-i)1})\cr
⊗≡ a↓1X↓{(n-1)j} + \cdots + a↓kX↓{(n-k)j}.\cr}$$
This proof does {\sl not} depend on the fact that operations
were done modulo 2, or modulo any prime number.
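Proof (1) can be exercised in executable form: run a $k$-bit word through the shift-and-XOR step and extract its low bit. A sketch, taking $k = 4$ with the primitive recurrence $x↓n = x↓{n-3} + x↓{n-4}$ (an illustrative tap choice, not one from the text):

```python
# Word X = (b1 b2 b3 b4)_2 represents b1*ξ^3 + ... + b4; one step
# multiplies by ξ and reduces via ξ^4 = a1*ξ^3 + ... + a4 (mod 2).
k = 4
a = [0, 0, 1, 1]                # a_1 .. a_k
a_word = 0b0011                 # (a_1 a_2 a_3 a_4)_2
X = 0b0001                      # nonzero start, representing ξ^0
bits = []
for _ in range(60):
    bits.append(X & 1)          # f extracts the last bit b_k
    carry = (X >> (k - 1)) & 1
    X = (X << 1) & (2**k - 1)
    if carry:
        X ^= a_word
ok = all(bits[n] == sum(a[i-1] * bits[n-i] for i in range(1, k+1)) % 2
         for n in range(k, 60))
print(ok)  # True
```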

\ansno 17.  (a) When the sequence terminates, the $(k
- 1)$-tuple $(X↓{n+1}, \ldotss, X↓{n+k-1})$ occurs for the $(m
+ 1)$st time. A given $(k - 1)$-tuple $(X↓{r+1}, \ldotss, X↓{r+k-1})$
can have only $m$ distinct predecessors $X↓r$, so one of these
occurrences must be for $r = 0$.\xskip (b) Since the $(k - 1)$-tuple
$(0, \ldotss , 0)$ occurs $(m + 1)$ times, each possible predecessor
appears, so the $k$-tuple $(a↓1, 0, \ldotss , 0)$ appears for
all $a↓1, 0 ≤ a↓1 < m$. Let $1 ≤ s < k$ and suppose we have
proved that all $k$-tuples $(a↓1, \ldotss , a↓s, 0, \ldotss ,
0)$ appear in the sequence when $a↓s ≠ 0$. By the construction,
this $k$-tuple would not be in the sequence unless $(a↓1, \ldotss
, a↓s, 0, \ldotss , 0, y)$ had appeared earlier for $1 ≤ y <
m$. Hence the $(k - 1)$-tuple $(a↓1, \ldotss , a↓s, 0, \ldotss
, 0)$ has appeared $m$ times, and all $m$ possible predecessors
appear; this means $(a, a↓1, \ldotss , a↓s, 0, \ldotss , 0)$ appears
for $0 ≤ a < m$. The proof is now complete by induction.

The result also follows from Theorem 2.3.4.2D\null,
using the directed graph of exercise 2.3.4.2--23; the set of
arcs from $(x↓1, \ldotss , x↓j, 0, \ldotss , 0)$ to $(x↓2, \ldotss
, x↓j, 0, \ldotss , 0, 0)$, where $x↓j ≠ 0$ and $1 ≤ j ≤ k$,
forms an oriented subtree related neatly to Dewey decimal notation.

\ansno 18.  The third-most-significant bit of $U↓{n+1}$
is completely determined by the first and third bits of $U↓n$,
so only 32 of the 64 possible pairs $(\lfloor 8U↓n\rfloor ,
\lfloor 8U↓{n+1}\rfloor )$ occur.\xskip ({\sl Notes:} If we had
used, say, 11-bit numbers $U↓n = (.X↓{11n}X↓{11n+1} \ldotsm X↓{11n+10})↓2$,
the sequence {\sl would} be satisfactory for many applications.
If another constant appears in \.A having more ``one'' bits,
the generalized spectral test might give some indication of
its suitability. See exercise 3.3.4--2; we could examine $\nu
↓t$ in dimensions $t = 36, 37, 38, \ldotss\,$.)

\ansno 21. [{\sl J. London Math. Soc. \bf 21} (1946),
169--172.]\xskip Any sequence of period length $m↑k - 1$ with no $k$
consecutive zeroes leads to a sequence of period length $m↑k$
by inserting a zero in the appropriate place, as in exercise
7; conversely, we can start with a sequence of period length
$m↑k$ and delete an appropriate zero from the period, to form
a sequence of the other type. Let us call these ``$(m, k)$ sequences''
of types A and B. The hypothesis assures us of the existence
of $(p, k)$ sequences of type A, for all primes $p$ and all
$k ≥ 1$; hence we have $(p, k)$ sequences of type B for all
such $p$ and $k$.


To get a $(p↑e, k)$ sequence of type B, let
$e = qr$, where $q$ is a power of $p$ and $r$ is not a multiple
of $p$. Start with a $(p, qrk)$ sequence of type A, namely
$X↓0, X↓1, X↓2, \ldotss $; then (using the $p$-ary number system)
the grouped digits $(X↓0 \ldotsm X↓{q-1})↓p, (X↓q \ldotsm X↓{2q-1})↓p,
\ldots$ form a $(p↑q, rk)$ sequence of
type A, since $q$ is relatively prime to $p↑{qrk} - 1$ and
the sequence therefore has a period length of $p↑{qrk} - 1$.
This leads to a $(p↑q, rk)$ sequence $\langle Y↓n\rangle$ of
type B; and $(Y↓0Y↓1 \ldotsm Y↓{r-1})↓{p↑q}$, $(Y↓rY↓{r+1}
\ldotsm Y↓{2r-1})↓{p↑q}$, $\ldots$ is a $(p↑{qr}, k)$ sequence
of type B by a similar argument, since $r$ is relatively prime
to $p↑{qk}$.

To get an $(m, k)$ sequence of type B for arbitrary
$m$, we can combine $(p↑e, k)$ sequences for each of the prime
power factors of $m$ using the Chinese remainder theorem; but
a simpler method is available. Let $\langle X↓n\rangle$ be an
$(r, k)$ sequence of type B, and let $\langle Y↓n\rangle$
be an $(s, k)$ sequence of type B, where $r$ and $s$ are relatively
prime; then $\langle sX↓n + Y↓n\rangle$ is an $(rs, k)$ sequence
of type B.

A simple, uniform construction that yields $(2,
k)$ sequences for arbitrary $k$ has been discovered by A. Lempel
[{\sl IEEE Trans.\ \bf C--19} (1970), 1204--1209].

\ansno 22.  By the Chinese remainder theorem, we can find constants
$a↓1, \ldotss , a↓k$ having desired residues mod each prime
divisor of $m$. If $m = p↓1p↓2 \ldotsm p↓t$, the period
length will be lcm$(p↑{k}↓{1} - 1, \ldotss , p↑{k}↓{t}
- 1)$. In fact, we can achieve reasonably long periods for arbitrary
$m$ (not necessarily squarefree), as shown in exercise 11.

\ansno 23.  Period length at least $2↑{55} - 1$; possibly
faster than (7); see exercise 3.2.1.1--5. Furthermore, R. Brent
has pointed out that the calculations can be done exactly on
floating-point numbers in $[\,0, 1)$.

\ansno 24.  Run the sequence backwards; i.e., if $Z↓n
= Y↓{-n}$ we have $Z↓n = (Z↓{n-m+k} + Z↓{n-m})\mod 2$.
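A numerical check of the reversal, taking small illustrative lags in place of the exercise's:

```python
# If Y_n = (Y_{n-k} + Y_{n-m}) mod 2, then the reversed sequence
# Z_n = Y_{-n} satisfies Z_n = (Z_{n-m+k} + Z_{n-m}) mod 2.
k_lag, m_lag = 3, 5                       # illustrative lags
y = [1, 0, 0, 1, 1]                       # arbitrary nonzero seed
for n in range(5, 400):
    y.append((y[n - k_lag] + y[n - m_lag]) % 2)
z = y[::-1]                               # reverse a finite stretch
ok = all(z[n] == (z[n - m_lag + k_lag] + z[n - m_lag]) % 2
         for n in range(m_lag, len(z)))
print(ok)  # True
```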

\ansno 25.  This actually would be slower and more complicated, unless it
can be used to save subroutine-calling overhead in high-level languages.\xskip
(See the FORTRAN program in Section 3.6.)

\ansno 27. Let $J↓n=\lfloor k X↓n/m\rfloor$.\xskip {\sl Lemma:} After the
$(k↑2+7k-2)/2$ consecutive values
$$0↑{k+2}\ 1\ 0↑{k+1}\ 2\ 0↑k\ \ldots\ (k-1)\ 0↑3$$
occur in the $\langle J↓n \rangle$ sequence, Algorithm B will have $V[j]
< m/k$ for $0≤j<k$, and also $Y<m/k$.\xskip {\sl Proof.} Let $S↓n$ be the
set of positions $i$ such that $V[i]<m/k$ just before $X↓n$ is generated,
and let $j↓n$ be the index such that $V[j↓n]←X↓n$. If $j↓n \notin S↓n$ and
$J↓n=0$, then $S↓{n+1}=S↓n ∪ \{j↓n\}$; if $j↓n \in S↓n$ and $J↓n=0$, then
$S↓{n+1}=S↓n$ and $j↓{n+1}=0$. After $k+2$ successive 0's, we must therefore
have $0\in S↓n$ and $j↓{n+1}=0$. Then after ``$1\ 0↑{k+1}$'' we must have
$\{0,1\} \subset S↓n$ and $j↓{n+1}=0$; after ``$2\ 0↑k$'' we must have
$\{0,1,2\} \subset S↓n$ and $j↓{n+1}=0$; and so on.

{\sl Corollary:} For $λ≥2(k↑2+7k-2) k↑{(k↑2+7k-2)/2}$, either Algorithm B
yields a period of length $λ$ or the sequence $\langle X↓n\rangle$ is poorly
distributed.\xskip {\sl Proof.} The probability that
any given length-$l$ pattern of $J$'s does not occur in a random sequence of
length $λ$ is less than $(1-k↑{-l})↑{λ/l}<\exp(-k↑{-l}λ/l)$. For $l=(k↑2+7k-2)/2$
this is at most $e↑{-4}$; hence the stated pattern should appear. After it
does, the subsequent behavior of Algorithm B will be the same each time it
reaches this part of the period.

\ansbegin{3.3.1}

\ansno 1. There are $k = 11$
categories, so the line $\nu = 10$ should be used.

\ansno 2.  ${2\over 49}$, ${3\over 49}$, ${4\over 49}$,
${5\over 49}$, ${6\over 49}$, ${9\over 49}$, ${6\over 49}$,
${5\over 49}$, ${4\over 49}$, ${3\over 49}$, ${2\over 49}$.

\ansno 3.  $V = 7{173\over 240}$, only very slightly higher
than that obtained from the good dice! There are two reasons
why we do not detect the weighting: (a) The new probabilities
(cf.\ exercise 2) are not really very far from the old ones in
Eq.\ (1). The sum of the two dice tends to smooth out the probabilities;
if we considered instead each of the 36 possible pairs of values,
and counted these, we would probably detect the difference quite
rapidly (assuming that the two dice are distinguishable).\xskip (b) A far
more important reason is that $n$ is too small for a significant
difference to be detected. If the same experiment is done for
large enough $n$, the faulty dice will be discovered (see exercise
12).

\ansno 4.  $p↓s = {1\over 12}$ for $2 ≤ s ≤ 12$ and $s
≠ 7$; $p↓7 = {1\over 6}$. The value of $V$ is $16{1\over 2}$,
which falls between the $75\%$ and $95\%$ entries in Table 1;
so it is reasonable, in spite of the fact that not too
many sevens actually turned up.

\ansno 5.  $K↑{+}↓{20} = 1.15$; $K↑{-}↓{20} = 0.215$; these
do not differ significantly from random behavior (being at about
the $94\%$ and $86\%$ levels), but they are mighty close.\xskip
(The data values in this exercise come from Appendix A, Table
1.)

\ansno 6.  The probability that $X↓j ≤ x$ is $F(x)$, so
we have the binomial distribution
discussed in Section 1.2.10. $F↓n(x) = s/n$ with probability
${n\choose s}F(x)↑s\biglp 1 - F(x)\bigrp ↑{n-s}$;
the mean is $F(x)$; the standard deviation is $\sqrt{F(x)\biglp
1 - F(x)\bigrp /n}$.\xskip (Cf.\ Eq. 1.2.10--19. This suggests that
a slightly better statistic would be to define
$$K↑{+}↓{n} = \sqrt n\max↓{-∞<x<∞}\biglp F↓n(x)
- F(x)\bigrp /\sqrt{F(x)\biglp 1 - F(x)\bigrp},$$
etc.)\xskip {\sl Notes:} Similarly, we can calculate
the mean and standard deviation of $F↓n(x) - F↓n(y)$, for $x
< y$, and obtain the covariance of $F↓n(x), F↓n(y)$. Using these
facts, it can be shown that for large values of $n$ the function
$F↓n(x)$ behaves as a ``Brownian motion,'' and techniques from
this branch of probability theory may be used to study it. The
situation is exploited in articles by J. L. Doob and M. D. Donsker,
{\sl Annals Math.\ Stat.\ \bf 20} (1949), 393--403
and {\bf 23} (1952), 277--281; this is generally regarded as the most
enlightening way to study the KS tests.
%folio 670 galley 6 (C) Addison-Wesley 1978	*
\ansno 7. $\biglp$Cf.\ Eq.\ (13).$\bigrp$\xskip Take
$j = n$ to see that $K↑{+}↓{n}$ is never negative and it can
get as high as $\sqrt{n}$. Similarly, take $j = 1$ to make the
same observations about $K↑{-}↓{n}$.

\ansno 8.  The new KS statistic was computed for 20 observations.
The distribution of $K↑{+}↓{10}$ was used as $F(x)$ when the
KS statistic was computed.

\ansno 9.  The idea is erroneous, because all of the
observations must be {\sl independent.} There is a relation
between the statistics $K↑{+}↓{n}$ and $K↑{-}↓{n}$ on the same
data, so each test should be judged separately.\xskip (A high value
of one tends to give a low value of the other.)\xskip Similarly, the
entries in Figs.\ 2 and 5 (which show 15 tests for each generator)
do not show 15 independent observations, because the ``maximum
of 5'' test is not independent of the ``maximum of 4'' test.
The three tests of each horizontal row are independent (because
they were done on different parts of the sequence), but the
five tests in a column are somewhat correlated. The net effect
of this is that the 95-percent probability levels, etc., which
apply to one test, cannot legitimately be applied to a whole
group of tests on the same data. Moral: When testing a random-number
generator, we may expect it to ``pass'' each of several tests,
e.g., the frequency test, maximum test, run test, etc.; but
an array of data from several different tests should not be
considered as a unit since the tests themselves may not be independent.
The $K↑{+}↓{n}$ and $K↑{-}↓{n}$ statistics should be considered
as two separate tests; a good source of random numbers will
pass both tests.

\ansno 10.  Each $Y↓s$ is doubled, and $np↓s$ is doubled,
so the numerators of (6) are quadrupled while the denominators
only double. Hence the new value of $V$ is twice as high as
the old one.

\ansno 11.  The empirical distribution function stays
the same; the values of $K↑{+}↓{n}$ and $K↑{-}↓{n}$ are multiplied
by $\sqrt{2}$.

\ansno 12.  Let $Z↓s = (Y↓s - nq↓s)/\sqrt{nq↓s}$. The
value of $V$ is $n$ times
$$\sum ↓{1≤s≤k}(q↓s - p↓s + \sqrt{q↓s/n}Z↓s)↑2/p↓s,$$
and the latter quantity stays bounded away from
zero as $n$ increases (since $Z↓sn↑{-1/4}$ is bounded
with probability 1). Hence the value of $V$ will increase to
a value that is extremely improbable under the $p↓s$ assumption.

For the KS test, let $F(x)$ be the assumed distribution,
$G(x)$ the actual distribution, and let $h = \max \left|G(x) - F(x)\right|$.
Take $n$ large enough so that $\left|F↓n(x)
- G(x)\right| > h/2$ occurs with very small probability; then
$\left|F↓n(x) - F(x)\right|$ will be
improbably high under the assumed distribution $F(x)$.

\ansno 13.  (The ``max'' notation should really be replaced
by ``sup'' since a least upper bound is meant; however, ``max''
was used in the text to avoid confusing too many readers by
the less familiar ``sup'' notation.) For convenience, let $X↓0
= -∞$, $X↓{n+1} = +∞$. When $X↓j ≤ x < X↓{j+1}$, we have $F↓n(x)
= j/n$, and in this interval $\max\biglp F↓n(x) - F(x)\bigrp
= j/n - F(X↓j)$; $\max\biglp F(x) - F↓n(x)\bigrp = F(X↓{j+1})
- j/n$. For $0 ≤ j ≤ n$, all real values of $x$ are considered;
this proves that
$$\eqalign{
K↑{+}↓{n}⊗=\sqrt n \max↓{0≤j≤n}\left({j\over n} - F(X↓j)\right);\cr
K↑{-}↓{n}⊗=\sqrt n \max↓{1≤j≤n+1}\left(F(X↓j) - {j - 1\over n}\right).\cr}$$
These are equivalent to (13), since the extra term under
the maximum signs is nonpositive and it must be redundant by
exercise 7.
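These reduced formulas translate directly into code; a sketch for the uniform case $F(x) = x$ on $[0, 1)$ (the sample is illustrative):

```python
# K+_n = sqrt(n) max_j (j/n - F(X_(j))),
# K-_n = sqrt(n) max_j (F(X_(j)) - (j-1)/n),  1 <= j <= n.

def ks_stats(xs):
    xs = sorted(xs)
    n = len(xs)
    kplus  = n**0.5 * max((j+1)/n - x for j, x in enumerate(xs))
    kminus = n**0.5 * max(x - j/n     for j, x in enumerate(xs))
    return kplus, kminus

kp, km = ks_stats([0.1, 0.2, 0.3, 0.4, 0.5])
print(round(kp, 4), round(km, 4))  # 1.118 0.2236
```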

\ansno 14.  The logarithm of the left-hand side simplifies
to
$$\halign{\hbox to size{#}\cr
$\dispstyle-\sum ↓{1≤s≤k}Y↓s \ln\left(1 + {Z↓s\over
\sqrt{np↓s}}\right) + {1 - k\over 2}\ln(2πn)$\hfill\cr
\hfill$\dispstyle - {1\over 2}\sum ↓{1≤s≤k}\ln p↓s - {1\over 2}
\sum ↓{1≤s≤k}\ln\left(1 + {Z↓s\over \sqrt{np↓s}}\right)+O\left(1\over n\right)$\cr
}$$
and this quantity simplifies further (upon expanding $\ln(1
+Z↓s/\sqrt{np↓s})$ and realizing that $\sum ↓{1≤s≤k}
Z↓s\sqrt{np↓s} = 0$) to
$$-{1\over 2} \sum ↓{1≤s≤k} Z↑{2}↓{s} + {1 - k\over 2}
\ln(2πn) - {1\over 2} \ln(p↓1 \ldotsm p↓k) + O\left(1\over
\sqrt{n}\right).$$

\ansno 15. The corresponding Jacobian determinant
is easily evaluated by (a) removing the factor $r↑{n-1}$ from
the determinant, (b) expanding the resulting determinant by
the cofactors of the row containing ``$\cos \theta ↓1$ $- \sin
\theta ↓1$ $0 \ldotsm 0$'' (each of the cofactor determinants
may be evaluated by induction), and (c) recalling that $\sin↑2
\theta ↓1 + \cos↑2 \theta ↓1 = 1$.

\ansno 16. $\dispstyle \int ↑{z\sqrt{2x}+y}↓{0}\hskip-3pt\exp\left(- {u↑2\over 2x}
+\cdots\right)\,du = ye↑{-z↑2} + O\left(1\over
\sqrt{x}\right) + \int ↑{z\sqrt{2x}}↓{0}\hskip-3pt\exp\left(-{u↑2\over
2x}+\cdots\right)\,du.$

\vskip 11pt plus 3pt minus 8pt
\noindent The latter integral is
$$\int ↑{z\sqrt{2x}}↓{0}e↑{-u↑2/2x}\,du + {1\over 3x↑2} \int
↑{z\sqrt{2x}}↓{0}e↑{-u↑2/2x}u↑3\,du + O\left(1\over
\sqrt{x}\right).$$
When all is put together, the final result is
$${\gamma (x + 1, x + z\sqrt{2x} + y)\over\Gamma
(x + 1)} = {1\over \sqrt{2π}} \int ↑{z\sqrt2}↓{-∞}e↑{-u↑2/2}
\,du + {e↑{-z↑2}\over\sqrt{2πx}}{\textstyle(y
- {2\over 3} - {2\over 3}z↑2)} + O\left(1\over \sqrt x\right).$$
If we set $z\sqrt{2} = x↓p$ and write
$${1\over \sqrt{2π}} \int ↑{z\sqrt2}↓{-∞}e↑{-u↑2/2}
\,du = p,\qquad x + 1 = {\nu \over 2} ,\qquad \gamma \left({\nu
\over 2} , {t\over 2}\right)\left/\Gamma \left(\nu \over 2\right)\right.
= p,$$
where $t/2 = x + z\sqrt{2x} + y$, we can solve
for $y$ to obtain $y = {2\over 3}(1 + z↑2) + O(1/\sqrt{x}\,)$,
which is consistent with the above analysis. The solution is
therefore $t = \nu + 2\sqrt{\nu }z + {4\over 3}z↑2 - {2\over
3} + O(1/\sqrt{\nu }\,)$.

\ansno 17.  (a) Change of variable, $x↓j ← x↓j + t$.\xskip (b)
Induction on $n$; by definition,
$$P↓{n0}(x - t) = \int ↑{x}↓{n}P↓{(n-1)0}(x↓n - t)\,dx↓n.$$
(c) The left-hand side is
$$\int ↑{x+t}↓{n}\,dx↓n\; \ldots \int ↑{x↓{k+2}}↓{k+1}\,dx↓{k+1}\qquad
\hbox{times}\qquad \int ↑{k}↓{t}\,dx↓k \int ↑{x↓k}↓{t}\,dx↓{k-1}\; \ldots
\int ↑{x↓2}↓{t}\,dx↓1.$$
(d) From (b), (c) we have
$$P↓{nk}(x) = \sum ↓{0≤r≤k} {(r - t)↑r\over r!} {(x + t - r)↑{n-r-1}\over
(n - r)!} (x + t - n).$$
The numerator in (24) is $P↓{n\lfloor t\rfloor}(n)$.

\ansno 18.  We may assume that $F(x) = x$ for $0 ≤ x ≤
1$, as remarked in the text's derivation of (24). If $0 ≤ X↓1
≤ \cdots ≤ X↓n ≤ 1$, let $Z↓j = 1 - X↓{n+1-j}$. We have $0
≤ Z↓1 ≤ \cdots ≤ Z↓n ≤ 1$; and $K↑{+}↓{n}$ evaluated for $X↓1,
\ldots , X↓n$ equals $K↑{-}↓{n}$ evaluated for $Z↓1, \ldots
, Z↓n$. This symmetrical relation gives a one-to-one correspondence
between sets of equal volume for which $K↑{+}↓{n}$ and $K↑{-}↓{n}$
fall in a given range.

\ansno 23. Let $m$ be any number $≥n$.\xskip (a)
If $\lfloor mF(X↓i)\rfloor = \lfloor mF(X↓j)\rfloor$ and $i
> j$ then $i/n - F(X↓i) > j/n - F(X↓j)$.\xskip (b) Start with $a↓k
= 1.0$, $b↓k = 0.0$, $c↓k = 0$ for $0 ≤ k < m$. Then do the following for
each observation $X↓j$: Set $Y ← F(X↓j)$, $k ← \lfloor mY\rfloor$,
$a↓k ← \min(a↓k, Y)$, $b↓k ← \max(b↓k, Y)$, $c↓k ← c↓k + 1$.\xskip$\biglp$Assume
that $F(X↓j) < 1$ so that $k < m$.$\bigrp$\xskip Then set $j ← 0$, $r↑+ ← r↑-
← 0$, and for $k = 0$, 1, $\ldotss$, $m - 1$ (in this order) do
the following whenever $c↓k > 0$: Set $r↑- ← \max(r↑-, a↓k
- j/n)$, $j ← j + c↓k$, $r↑+ ← \max(r↑+, j/n - b↓k)$. Finally set
$K↑{+}↓{n} ← \sqrt{n}r↑+$, $K↑{-}↓{n} ← \sqrt{n}r↑-$. The time
required is $O(m + n)$, and the precise value of $n$ need not
be known in advance.\xskip (If the estimate $(k + {1\over 2})/m$ is
used for $a↓k$ and $b↓k$, so that only the values $c↓k$ are actually
computed for each $k$, we obtain estimates of $K↑{+}↓{n}$ and
$K↑{-}↓{n}$ good to within ${1\over2}\sqrt{n}/m$, even when $m < n$.)\xskip
[{\sl ACM Trans.\ Math.\ Software \bf 3} (1977), 60--64.]
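
A Python sketch of this procedure, assuming $m ≥ n$ and each $F(X↓j)$ in $[0,1)$, checked against the direct computation from the sorted values (names ours):

```python
import math

def ks_bucketed(ys, m):
    # K_n^+ and K_n^- in O(m+n) as described above; ys holds the values F(X_j)
    a = [1.0] * m; b = [0.0] * m; c = [0] * m
    for y in ys:
        k = int(m * y)                       # k = floor(m y) < m
        a[k] = min(a[k], y); b[k] = max(b[k], y); c[k] += 1
    n = len(ys); j = 0; rp = 0.0; rm = 0.0
    for k in range(m):
        if c[k] > 0:
            rm = max(rm, a[k] - j / n)
            j += c[k]
            rp = max(rp, j / n - b[k])
    return math.sqrt(n) * rp, math.sqrt(n) * rm

def ks_direct(ys):
    ys = sorted(ys); n = len(ys)
    rp = max((j + 1) / n - y for j, y in enumerate(ys))
    rm = max(y - j / n for j, y in enumerate(ys))
    return math.sqrt(n) * rp, math.sqrt(n) * rm

ys = [0.31, 0.05, 0.77, 0.42, 0.42, 0.90, 0.13, 0.68, 0.25, 0.59]
assert ks_bucketed(ys, 16) == ks_direct(ys)
```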
%folio 674 galley 7 (C) Addison-Wesley 1978	*
\ansbegin{3.3.2}

\ansno 1. The observations for
a chi-square test must be independent, and in the second sequence
successive observations are manifestly dependent, since the
second component of one equals the first component of the next.

\ansno 2.  Form $t$-tuples $(Y↓{jt}, \ldotss, Y↓{jt+t-1})$, for
$0 ≤ j < n$, and count how many of these equal each possible
value. Apply the chi-square test with $k = d↑t$ and with probability
$1/d↑t$ in each category. The number of observations, $n$, should
be at least $5d↑t$.
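
In code (a Python sketch, names ours):

```python
from collections import Counter

def serial_test_v(seq, d, t):
    # chi-square statistic for non-overlapping t-tuples over {0,...,d-1}:
    # k = d^t categories, each with probability 1/d^t
    n = len(seq) // t
    counts = Counter(tuple(seq[j * t:(j + 1) * t]) for j in range(n))
    e = n / d ** t                           # expected count per category
    v = sum((y - e) ** 2 / e for y in counts.values())
    return v + (d ** t - len(counts)) * e    # categories never observed

assert abs(serial_test_v([0, 1, 0, 1, 2, 3, 1, 0], 4, 2) - 20.0) < 1e-9
```

The toy call merely checks the arithmetic; a meaningful application wants $n ≥ 5d↑t$.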

\ansno 3.  The probability that $j$ values are examined,
i.e., the probability that $U↓{j-1}$ is the $n$th element of
the sequence lying in the range $α ≤U↓{j-1} < β$, is easily
seen to be
$${j-1\choose n-1}p↑n(1 - p)↑{j-n},$$
by enumeration of the possible places in which
the other $n - 1$ occurrences can appear and by evaluating the
probability of such a pattern. The generating function is $G(z)
= \biglp pz/(1 - (1 - p)z)\bigrp ↑n$, which makes sense since
the given distribution is the $n$-fold convolution of the same
thing for $n = 1$. Hence the mean and variance are proportional
to $n$; the number of $U$'s to be examined is now easily found
to have the characteristics $\biglp$min\penalty200\ $n$, ave $n/p$, max $∞$,
dev $\sqrt{n(1 - p)/p}\bigrp$. A more detailed discussion of
this probability distribution when $n = 1$ may be found in the
answer to exercise 3.4.1--17; see also the considerably more
general results of exercise 2.3.4.2--26.

\ansno 4.  The probability of a gap of length $≥r$ is
the probability that $r$ consecutive $U$'s lie outside the given
range, i.e., $(1 - p)↑r$. The probability of a gap of length
exactly $r$ is the above value for length $≥r$ minus the value
for length $≥(r + 1)$.

\ansno 5.  As $N$ goes to infinity, so does $n$ (with
probability 1), hence this test is just the same as the gap
test described in the text except for the length of the very
last gap. And the text's gap test certainly is asymptotic to
the chi-square distribution stated, since the length of each
gap is independent of the length of the others.\xskip [{\sl
Notes:} A quite complicated proof of this result by E. Bofinger
and V. J. Bofinger appears in {\sl Annals Math.\ Stat.\ 
\bf 32} (1961), 524--534. Their paper is noteworthy because
it discusses several interesting variations of the gap test;
they show, for example, that the quantity
$$\sum ↓{0≤r≤t} {\biglp Y↓r - (Np)p↓r\bigrp ↑2\over (Np)p↓r}$$
does {\sl not} approach a chi-square distribution,
although others had suggested this statistic as a ``stronger''
test because $Np$ is the expected value of $n$.]

\ansno 7.  5, 3, 5, 6, 5, 5, 4.

\ansno 8.  See exercise 10, with $w = d$.

\ansno 9.  (Change $d$ to $w$ in steps C1 and C4.) We
have
$$\eqalign{p↓r⊗ = {d(d - 1)\ldotsm(d - w + 1)\over d↑r}{r-1\comb\{\}
 w - 1}\qquad \hbox{for }w ≤ r < t;\cr
\noalign{\vskip 3pt}
p↓t⊗= 1 - {d!\over d↑{t-1}}\left({1\over 0!}{t-1\comb\{\}
d} + \cdots + {1\over (d - w)!} {t-1\comb\{\}w}\right).\cr}$$

\ansno 10. As in exercise 3, we really need consider
only the case $n = 1$. The generating function for the probability
that a coupon set has length $r$ is
$$G(z) = {d!\over (d - w)!} \sum ↓{r>0}{r-1\comb\{\} w-1}\left( z\over d\right)↑r
= z↑w\left(d-1\over d-z\right)\ldotsm\left(d-w+1\over d-(w-1)z\right)$$
by the previous exercise and Eq. 1.2.9--28. The mean and variance are readily
computed using Theorem 1.2.10A and exercise 3.4.1--17. We find
that
$$\eqalign{\hbox{mean}(G)⊗= w + \left({d\over d - 1} - 1\right)+\cdots+
\left({d\over d-w+1}-1\right)=d(H↓d-H↓{d-w})=\mu;\cr
\hbox{var}(G) ⊗= d↑2(H↑{(2)}↓{d} - H↑{(2)}↓{d-w})
- d(H↓d-H↓{d-w}) = \sigma ↑2.\cr}$$
The number of $U$'s examined, as the search for a coupon
set is repeated $n$ times, therefore has the characteristics
$\biglp$min $wn$, ave $\mu n$, max $∞$, dev $\sigma \sqrt{n}\bigrp$.
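
Both formulas can be checked against the view of the coupon search as $w$ independent geometric stages (a Python sketch with exact arithmetic; names ours):

```python
from fractions import Fraction as Fr

def H(n, r=1):
    # generalized harmonic number H_n^{(r)}
    return sum(Fr(1, k ** r) for k in range(1, n + 1))

d, w = 10, 6
mean = d * (H(d) - H(d - w))
var = d ** 2 * (H(d, 2) - H(d - w, 2)) - d * (H(d) - H(d - w))

# going from i distinct values seen to i+1 is geometric with success
# probability (d-i)/d, hence mean d/(d-i) and variance i*d/(d-i)^2
assert mean == sum(Fr(d, d - i) for i in range(w))
assert var == sum(Fr(i * d, (d - i) ** 2) for i in range(w))
```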

\def\0{\hbox to 7pt{}}
\def\\{\hbox to 7pt{\hfill\vrule height 9.944pt depth 3pt\hfill}}
\ansno 11.  \\1\\2\\9\08\05\03\\6\\7\00\\4\\.

\ansno 12.  {\bf Algorithm R} ({\sl Data for run test\/}){\bf.}

\algstep R1. [Initialize.] Set $j
← -1$, and set $\.{COUNT}[1] ← \.{COUNT}[2] ← \cdots ←
\.{COUNT}[6] ← 0$. Also set $U↓n ← U↓{n-1}$, for convenience in terminating the
algorithm.

\algstep R2. [Set $r$ zero.] Set $r ← 0$.

\algstep R3. [Is $U↓j < U↓{j+1}$?]
Increase $r$ and $j$ by 1. If $U↓j < U↓{j+1}$, repeat this step.

\algstep R4. [Record the length.] If $r ≥
6$, increase $\.{COUNT}[6]$ by one, otherwise increase $\.{COUNT}[r]$
by one.

\algstep R5. [Done?] If $j < n - 1$, return
to step R2.\quad\blackslug

\def\\{\mathrel{\vcenter{\hbox{>}\vskip-4pt\hbox{<}}}}
\ansno 13. There are $(p +
q + 1){p+q\choose p}$ ways to have $U↓{i-1} \\ U↓i < \cdots
< U↓{i+p-1}\\ U↓{i+p} < \cdots < U↓{i+p+q-1}$; subtract
$p+q+1\choose p+1$ of these in which $U↓{i-1} < U↓i$, and
subtract $p+q+1\choose 1$ for those in which $U↓{i+p-1} <
U↓{i+p}$; then add in 1 for the case that both $U↓{i-1} < U↓i$
and $U↓{i+p-1} < U↓{i+p}$, since this case has been subtracted
out twice.\xskip (This is a special case of the ``inclusion-exclusion''
principle, which is explained further in Section 1.3.3.)

\ansno 14.  A run of length $r$ occurs with probability
$1/r! - 1/(r + 1)!$, assuming distinct $U$'s.
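
Since only relative order matters, this is easy to confirm by enumerating permutations (Python, names ours):

```python
from itertools import permutations
from math import factorial

def initial_run(seq):
    # length of the initial ascending run
    r = 1
    while r < len(seq) and seq[r - 1] < seq[r]:
        r += 1
    return r

n = 7
counts = [0] * (n + 1)
for p in permutations(range(n)):
    counts[initial_run(p)] += 1

for r in range(1, 5):
    # probability 1/r! - 1/(r+1)! of a run of length exactly r
    assert counts[r] == factorial(n) // factorial(r) - factorial(n) // factorial(r + 1)
```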

\ansno 15.  This is always true of $F(X)$ when $F$ is
continuous and $S$ has distribution $F$; see Section 3.3.1C.

\ansno 16.  (a) $Z↓{jt} = \max(Z↓{j(t-1)}, Z↓{(j+1)(t-1)})$.
If the $Z↓{j(t-1)}$ are stored in memory, it is therefore a
simple matter to transform this array into the set of $Z↓{jt}$
with no auxiliary storage required.\xskip (b) With his ``improvement,''
each of the $V$'s should indeed have the stated distribution,
but the observations are no longer independent. In fact, when
$U↓j$ is a relatively large value, all of $Z↓{jt}$, $Z↓{(j-1)t}$,
$\ldotss$, $Z↓{(j-t+1)t}$ will be equal to $U↓j$; so we almost
have the effect of repeating the same data $t$ times (and that
would multiply $V$ by $t$, cf.\ exercise 3.3.1--10).

\ansno 17.  (b) By Lagrange's identity, the difference
is $\sum ↓{0≤k<j<n}(U↑{\prime}↓{k}V↑{\prime}↓{j}
- U↑{\prime}↓{j}V↑{\prime}↓{k})↑2$ and this is certainly positive.\xskip
(c) Therefore if $D↑2 = N↑2$, we must have $U↑{\prime}↓{k}V↑{\prime}↓{j}
- U↑{\prime}↓{j}V↑{\prime}↓{k} = 0$, for all pairs $j, k$. This
means that the matrix
$$\left(\vcenter{\halign{\ctr{$#↑\prime↓1$}⊗ \ctr{$#↑\prime↓2$}$\ldotsm$⊗\ctr
{$#↑\prime↓{n-1}$}\cr
U⊗U⊗U\cr V⊗V⊗V\cr}}\right)$$
has rank $<2$, so its rows are linearly dependent.\xskip
(A more elementary proof can be given, using the fact that $U↑\prime↓0V↑\prime↓j
- U↑{\prime}↓{j}V↑{\prime}↓{0} = 0$ for $ 1 ≤ j < n$ implies the
existence of constants $α, β$ such that $αU↑{\prime}↓{j} + βV↑{\prime}↓{j}
= 0$ for all $j$, provided that $U↑{\prime}↓{0}$ and $V↑{\prime}↓{0}$
are not both zero; the latter case can be avoided by a suitable
renumbering.)

\ansno 18.  (a) The numerator is $-(U↓0 - U↓1)↑2$, the
denominator is $(U↓0 - U↓1)↑2$.\xskip (b) The numerator is $-(U↑{2}↓{0}
+ U↑{2}↓{1} + U↑{2}↓{2} - U↓0U↓1 - U↓1U↓2 - U↓2U↓0)$; the denominator
is $2(U↑{2}↓{0} + \cdots - U↓2U↓0)$.\xskip (c) The denominator always
equals $\sum ↓{0≤j<k<n}(U↓j - U↓k)↑2$, by exercise
1.2.3--30 or 1.2.3--31.

\ansno 21. The successive values of $c↓{r-1}=s-1$ in step P2 are
2, 3, 7, 6, 4, 2, 2, 1, 0; hence $f=886862$.

\ansno 22.  $1024=6!+2\cdot5!+2\cdot4!+2\cdot3!+2\cdot2!+0\cdot1!$, so we
want the successive values of $s-1$ in step P2 to be 0, 0, 0, 1, 2, 2, 2, 2, 0;
working backwards, the permutation is $(9,6,5,2,3,4,0,1,7,8)$.
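
Both computations are easy to mechanize; a Python sketch (our reconstruction of step P2: $s-1$ is the 0-origin position of the maximum among the first $r$ elements, which is then swapped out of the way), together with the factorial-radix conversion:

```python
def perm_to_f(perm):
    # index f of a permutation in the factorial number system
    u = list(perm); f = 0
    for r in range(len(u), 1, -1):
        s1 = u.index(max(u[:r]))        # the value s-1 of step P2
        f = f * r + s1
        u[s1], u[r - 1] = u[r - 1], u[s1]
    return f

# exercise 21: successive values 2,3,7,6,4,2,2,1,0 of s-1 give f = 886862
f = 0
for r, s1 in zip(range(10, 1, -1), [2, 3, 7, 6, 4, 2, 2, 1, 0]):
    f = f * r + s1
assert f == 886862

# exercise 22: working backwards from 1024 = 6!+2*5!+2*4!+2*3!+2*2!
assert perm_to_f((9, 6, 5, 2, 3, 4, 0, 1, 7, 8)) == 1024
```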

\ansbegin{3.3.3}

\ansno 1. $y((x/y))
+ {1\over 2}y - {1\over 2}y\delta (x/y)$.

\ansno 2. See exercises 1.2.4--38 and 1.2.4--39(a), (b), (g).

\ansno 3.  $f(x) = \sum ↓{n≥1}(-\sin 2πnx)/n$, which
converges for all $x$.\xskip (The representation in Eq.\ (24) may be
considered a ``finite'' Fourier series, for the case when $x$
is rational.)

\ansno 4.  $d = 2↑{10}\cdot 5$. Note that we have $X↓{n+1}
< X↓n$ with probability ${1\over 2} + \epsilon $, where
$$|\epsilon | < d/(2\cdot10↑10) = 1/(2\cdot5↑9);$$
hence {\sl every} potency-10 generator is respectable
from the standpoint of Theorem P.

\ansno 5.  An intermediate result:
$$\sum ↓{0≤x<m} {x\over m} {s(x)\over m} = {1\over 12} \sigma
(a, m, c) + {m\over 4} - {c\over 2m} - {x↑\prime \over 2m}.$$
%folio 678 galley 8 (not on paper tape) (C) Addison-Wesley 1978	*
\ansno 6. (a)\9 Use induction and the formula
$$\left(\left(hj+c\over k\right)\right)-\left(\left(hj+c-1\over k\right)\right)
={1\over k}-{1\over 2}\delta\left(hj+c\over k\right)-{1\over2}\delta\left(
hj+c-1\over k\right).$$
(b) Use the fact that $\dispstyle-\left(\left(h↑\prime j\over k\right)\right)
=-\left(\left({j\over hk}-{k↑\prime j\over h}\right)\right)=
\left(\left(k↑\prime j\over h\right)\right)-{j\over hk} + {1\over2}\delta
\left(k↑\prime j\over h\right)$.

\ansno 7. Take $m=h$, $n=k$, $k=2$ in the second formula of exercise 1.2.4--45:
$$\twoline{\sum↓{0<j<k}\biggglp\hskip-2pt 
{hj\over k}-\left(\left(hj\over k\right)\right)+
{1\over 2}\hskip-2pt\bigggrp
\biggglp\hskip-2pt {hj\over k}-\left(\left(hj\over k\right)\right)-
{1\over 2}\hskip-2pt\bigggrp+2\<\sum↓{0<j<h}
\biggglp\hskip-2pt {kj\over h}-\left(\left(kj\over h\right)
\right)+{1\over 2}\bigggrp j}{-4pt}{=kh(h-1).\hskip-9pt}$$
The sums on the left simplify, and by standard manipulations we get
$$h↑2k-hk-{h\over2}+{h↑2\over6k}+{k\over12}+{1\over4}-{h\over6}\sigma(h,k,0)
-{h\over6}\sigma(k,h,0)+{1\over12}\sigma(1,k,0)=h↑2k-hk.$$
Since $\sigma(1,k,0)=(k-1)(k-2)/k$, this reduces to the reciprocity law.
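
The reciprocity law just obtained, $\sigma(h,k,0)+\sigma(k,h,0)=h/k+k/h+1/hk-3$, can be verified from definition (16) with exact rational arithmetic (a Python sketch, names ours):

```python
from fractions import Fraction as Fr
from math import floor, gcd

def saw(x):
    # sawtooth ((x)): x - floor(x) - 1/2 for x not an integer, 0 otherwise
    return Fr(0) if x.denominator == 1 else x - floor(x) - Fr(1, 2)

def sigma(h, k, c=0):
    # generalized Dedekind sum, as in definition (16)
    return 12 * sum(saw(Fr(j, k)) * saw(Fr(h * j + c, k)) for j in range(k))

for h, k in [(2, 3), (3, 5), (5, 7), (7, 12)]:
    assert gcd(h, k) == 1
    assert sigma(h, k) + sigma(k, h) == Fr(h, k) + Fr(k, h) + Fr(1, h * k) - 3
for k in (3, 5, 8):
    assert sigma(1, k) == Fr((k - 1) * (k - 2), k)
```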

\ansno 8. See {\sl Duke Math.\ J. \bf21} (1954), 391--397.

\ansno 9. Begin with the handy identity $\sum↓{0≤k<r}\lfloor kp/r\rfloor\lfloor kq/r
\rfloor+\sum↓{0≤k<p}\lfloor kq/p\rfloor\lfloor kr/p\rfloor+\sum↓{0≤k<q}
\lfloor kr/q\rfloor\lfloor kp/q\rfloor = (p-1)(q-1)(r-1)$ for which a simple
geometric proof is possible.\xskip [U. Dieter, {\sl Abh.\ Math.\ Sem.\ Univ.\ 
Hamburg \bf21} (1957), 109--125.]

\ansno 10. Obviously $\sigma(k-h,k,c)=-\sigma(h,k,-c)$, cf.\ (8). Replace $j$
by $k-j$ in definition (16), to deduce that $\sigma(h,k,c)=\sigma(h,k,-c)$.

\vskip 12pt plus 3pt minus 9pt
\ansno 11. (a)\9$\dispstyle\sum↓{0≤j<dk}\left(\left(j\over dk\right)\right)\left(
\left(hj+c\over k\right)\right)=\sum↓{\scriptstyle 0≤i<d\atop\scriptstyle 0≤j<k}
\left(\left(ik+j\over dk\right)\right)\left(\left(hj+c\over k\right)\right)$;
use (10) to sum on $i$.\xskip (b) $\dispstyle\left(\left(hj+c+\theta\over k\right)
\right)=\left(\left(hj+c\over k\right)\right)+{\theta\over k}-{1\over2}\delta
\left(hj+c\over k\right)$; now sum.

\vskip 12pt plus 3pt minus 9pt
\ansno 12. Since $\left(\left(hj+c\over k\right)\right)$ runs through the same
values as $\left(\left(j\over k\right)\right)$ in some order, Cauchy's
inequality implies that $\sigma(h,k,c)↑2≤\sigma(1,k,0)↑2$; and $\sigma(1,k,0)$
may be summed directly, cf.\ exercise 7.

\ansno 13. $\dispstyle\sigma(h,k,c)+{3(k-1)\over k}={12\over k}\sum↓{0<j<k}
{\omega↑{-cj}\over(\omega↑{-hj}-1)(\omega↑j-1)}+{6\over k}(c\mod k)
-6\left(\left(h↑\prime c\over k\right)
\right)$,\par\noindent if $hh↑\prime≡1\modulo k$.

\ansno 14. $(2↑{38}-3\cdot2↑{20}+5)/(2↑{70}-1)\approx2↑{-32}$. An extremely
satisfactory global value, in spite of the local nonrandomness!

\ansno 15. Replace $c↑2$ where it appears in (19) by $\lfloor c\rfloor\lceil
c\rceil$.

\ansno 16. The hinted identity is equivalent to $m↓1=p↓rm↓{r+1}+p↓{r-1}m↓{r+2}$
for $1≤r≤t$; this follows by induction, cf.\ also exercise 4.5.3--32. Now
replace $c↓j$ by $\sum↓{j≤r≤t}b↓rm↓{r+1}$ and compare coefficients of $b↓ib↓j$
on both sides of the identity to be proved.

({\sl Note:} For all exponents $e≥1$ we have
$$\sum↓{1≤j≤t}(-1)↑{j+1}{c↑e↓j\over m↓jm↓{j+1}}={1\over m↓1}\sum↓{1≤j≤t}
(-1)↑{j+1}b↓j{(c↑e↓j-c↑e↓{j+1})\over c↓j-c↓{j+1}}p↓{j-1}$$
by a similar argument.)

\ansno 17. During this algorithm we will have $k=m↓j$, $h=m↓{j+1}$, $c=c↓j$,
$p=p↓{j-1}$, $p↑\prime=p↓{j-2}$, $s=(-1)↑{j+1}$ for $j=1$, 2, $\ldotss$, $t+1$.

\algstep D1. [Initialize.] Set $A←0$, $B←h$, $p←1$, $p↑\prime
←0$, $s←1$.

\algstep D2. [Divide.] Set $a←\lfloor k/h\rfloor$, $b←\lfloor c/h\rfloor$,
$r←c\mod h$.\xskip (Now $a=a↓j$, $b=b↓j$, $r=c↓{j+1}$.)

\algstep D3. [Accumulate.] Set $A←A-(a-6b)s$, $B←B+6bp(c+r)s$. If $r≠0$ or
$c=0$, set $A←A-3s$. If $h=1$, set $B←B+ps$.\xskip (This subtracts $3e(m↓{j+1},c↓j)$
and also takes care of the $\sum(-1)↑{j+1}/m↓jm↓{j+1}$ terms.)

\algstep D4. [Prepare for next iteration.] Set $c←r$, $s←-s$; set $r←k-ah$,
$k←h$, $h←r$; set $r←ap+p↑\prime$, $p↑\prime←p$, $p←r$. If $h>0$, return to
D2.\quad\blackslug

\penalty-200 % This will help the following page, the way things are breaking
\yyskip At the conclusion of this algorithm, $p$ will be equal to the original value
$k↓0$ of $k$, so the desired answer will be $A+B/p$. The final value of $p↑\prime$
will be $h↑\prime$ if $s<0$, otherwise $p↑\prime$ will be $k↓0-h↑\prime$.
It would be possible to
maintain $B$ in the range $0≤B<k↓0$, by making appropriate adjustments to
$A$, thereby requiring only single-precision operations (with double-precision
products and dividends) if $k↓0$ is a single-precision number.

\ansno 18. A moment's thought shows that the formula
$$\textstyle S(h,k,c,z)=\sum↓{0≤j<k}
\biglp\lfloor j/k\rfloor-\lfloor(j-z)/k\rfloor\bigrp\biglp\biglp(hj+c)/k\bigrp
\bigrp$$ is in fact valid for all $z$, not only when $k≥z$. Writing
$\lfloor j/k\rfloor-\lfloor(j-z)/k\rfloor={z\over k}+\left(\left(j-z\over k\right)
\right)-\left(\left(j\over k\right)\right)+{1\over2}\delta↓{j0}-{1\over2}\delta
\left(j-z\over k\right)$
and carrying out the sums yields $S(h,k,c,z)=zd((c/d))/k+{1\over12}\sigma(h,k,hz+c)
-{1\over12}\sigma(h,k,c)+{1\over2}((c/k))-{1\over2}\biglp\biglp(hz+c)/k\bigrp
\bigrp$, where $d=\gcd(h,k)$.

% This paragraph break introduced to fill up a sparse page
[This formula allows us to express the probability that
$X↓{n+1}<X↓n<α$ in terms of generalized Dedekind sums, given $α$.]

\ansno 19. The desired probability is
$$
\def\\{\lower 1.36pt\nullα} % makes α have an effective tail like β
\halign{\hbox to size{$\textstyle#$\hfill}\cr
\sum↓{0≤x<m}\biglp\lfloor(x-α)/m\rfloor-
\lfloor(x-β)/m\rfloor\bigrp\biglp\lfloor(s(x)-α↑\prime)/m\rfloor-
\lfloor(s(x)-β↑\prime)/m\rfloor\bigrp/m\cr
\noalign{\vskip4pt}
\quad=\sum↓{0≤x<m}\left({β-α\over m}+\left(\left(x-β\over m
\right)\right)-\left(\left(x-\\\over m\right)\right)+{1\over2}\delta\left(x-\\
\over m\right)-{1\over2}\delta\left(x-β\over m\right)\right)\times\null\cr
\noalign{\penalty1000}
\def\biglp{\mathopen{\vcenter{\hbox{\:@\char0}}}}
\def\bigrp{\mathclose{\vcenter{\hbox{\:@\char1}}}}
\quad\qquad
\biglp{β↑\prime-α↑\prime\over m}+\biglp\biglp{ax+c-β↑\prime\over m}\bigrp \bigrp 
-\biglp\biglp{ax+c-\\↑\prime\over m}\bigrp \bigrp +{1\over2}\delta\biglp
{ax+c-\\↑\prime\over m}\bigrp -{1\over 2}\delta\biglp{ax+c-β↑\prime\over m}\bigrp 
\bigrp /m\cr
\noalign{\vskip4pt}
\quad={β-α\over m}{β↑\prime-α↑\prime\over m}+{1\over12m}
\biglp\sigma(a,m,c+aα-α↑\prime)-\sigma(a,m,c+aα-β↑\prime)+\null\cr
\noalign{\penalty1000}
\hskip 120pt\sigma(a,m,c+aβ-β↑\prime)
-\sigma(a,m,c+aβ-α↑\prime)\bigrp+ε,\cr}$$
where $|ε|≤2.5/m$.

[This approach is
due to U. Dieter. The discrepancy between the true probability and the
ideal value ${β-α\over m}{β↑\prime-α↑\prime\over m}$ is
bounded by $\sum↓{1≤j≤t}a↓j/4m$, according to Theorem K\null; conversely, by
choosing $α$, $β$, $α↑\prime$, $β↑\prime$ appropriately we will obtain a
discrepancy of at least half this bound when there are large partial quotients,
by the fact that Theorem K is ``best possible.'' Note that when $a
\approx \sqrt m$ the discrepancy cannot exceed $O\biglp1/\sqrt m\,\bigrp$, so
even the locally nonrandom generator of exercise 14 will look good on the
serial test over the full period; it appears that we should insist on an
{\sl extremely} small discrepancy.]

\ansno 20. $\sum↓{0≤x<m}\left.\left\lceil\biglp x-s(x)\bigrp/m\right\rceil
\left\lceil\biglp s(x)-s(s(x))\bigrp/m\right\rceil\right/m
\hskip0ptplus3pt=\hskip0ptplus3pt
\sum↓{0≤x<m}\biglp\biglp x-s(x)\bigrp/m+\biglp\biglp(bx+c)/m\bigrp\bigrp+{1\over2}
\bigrp\biglp\biglp s(x)-s(s(x))\bigrp/m+\biglp\biglp a(bx+c)/m\bigrp\bigrp
+{1\over2}\bigrp\hbox{\:a/}m$; and $x/m=((x/m))+{1\over2}-{1\over2}\delta(x/m)$,
$s(x)/m=\biglp\biglp(ax+c)/m\bigrp\bigrp+{1\over2}-{1\over2}\delta\biglp(ax+c)
/m\bigrp$, $s(s(x))/m=\biglp\biglp(a↑2x+ac+c)/m\bigrp\bigrp+{1\over2}-{1\over2}
\delta\biglp(a↑2x+ac+c)/m\bigrp$. Let $s(x↑\prime)=s(x↑{\prime\prime})=0$ and
$d=\gcd(b,m)$. The sum now reduces to
$$\twoline{{1\over4}+{1\over12m}(S↓1-S↓2+S↓3-S↓4+S↓5-S↓6+S↓7-S↓8+S↓9)+
{d\over2m}\biggglp\left(\left(c\over d\right)\right)+\left(\left(ac\over d\right)
\right)\bigggrp}{2pt}{\null+{1\over2m}\biggglp\left(\left(x↑\prime-x↑{\prime\prime}
\over m\right)\right)-\left(\left(x↑\prime\over m\right)\right)+\left(\left(
x↑{\prime\prime}\over m\right)\right)+\left(\left(ac+c\over m\right)\right)-
\left(\left(c\over m\right)\right)-{1\over2}\bigggrp,}$$
where $S↓1=\sigma(a,m,c)$, $S↓2=\sigma(a↑2,m,ac+c)$, $S↓3=\sigma(ab,m,ac)$,
$S↓4=\sigma(1,m,0)=(m-\penalty1000 1)\*(m-2)/m$, $S↓5=\sigma(a,m,c)$, $S↓6=\sigma(b,m,c)$,
$S↓7=-\sigma(a↑\prime-1,m,a↑\prime c)$, $S↓8=-\sigma(a↑\prime(a↑\prime-1),\penalty0
m,(a↑\prime)↑2c)$, if $a↑\prime a≡1\modulo m$; and
$$\eqalign{S↓9⊗=12\sum↓{0≤x<m}\left(\left(bx+c\over m\right)\right)
\left(\left(a(bx+c)\over m\right)\right)\cr⊗=12d\sum↓{0≤x<m/d}\left(\left(x+c↓0/d
\over m/d\right)\right)\left(\left(a(x+c↓0/d\over m/d\right)\right)\cr
⊗=12d\sum↓{0≤x<m/d}\biggglp\left(\left(x\over m/d\right)\right)+
{c↓0\over m}-{1\over2}\delta↓{x0}\bigggrp\left(\left(a(x+c↓0/d)\over m/d
\right)\right)\cr
⊗=d\;\biggglp\sigma(ad,m,ac↓0)+12{c↓0\over m}\left(\left(ac↓0\over d\right)\right)
-6\left(\left(ac↓0\over m\right)\right)\bigggrp\cr}$$
where $c↓0=c\mod d$. The grand total will be near $1\over6$ when $d$ is small
and when the fractions $a/m$, $(a↑2\mod m)/m$, $(ab\mod m)/m$, $b/m$, $(a↑\prime-1)
/m$, $(a↑\prime(a↑\prime-1)\mod m)/m$, $((ad)\mod m)/m$ all have small partial
quotients.\xskip (Note that $a↑\prime-1≡-b+b↑2-\cdotss$, cf.\ exercise 3.2.1.3--7.)

\ansno 21. $C=(s-({1\over2})↑2)/({1\over3}-({1\over2})↑2)$, where
$$\eqalign{s↓n⊗=\int↓{x↓n}↑{x↓{n+1}}x\{ax+\theta\}\,dx={1\over a↑2}\left(
{1\over3}-{\theta\over2}+{n\over 2}\right),\qquad
\hbox{if }x↓n={n-\theta\over a};\cr
s⊗=\int↓0↑1 x\{ax+\theta\}\,dx=s↓0+s↓1+\cdots+s↓{a-1}+\int↓{-\theta/a}↑0(ax+\theta)
\,dx\cr⊗={1\over3a}-{\theta\over2a}+{a-1\over4a}+{\theta↑2\over2a}.\cr}$$
Therefore $C=(1-6\theta+6\theta↑2)/a$.
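
The value of $s$, and hence $C$, can be confirmed by exact piecewise integration (Python with rational arithmetic; names ours, and $\theta$ is passed as an exact fraction):

```python
from fractions import Fraction as Fr
from math import floor

def s_exact(a, theta):
    # s = integral_0^1 x {a x + theta} dx, integrated on the intervals
    # where floor(a x + theta) = n is constant
    pts = sorted({Fr(0), Fr(1)} |
                 {(n - theta) / a for n in range(1, a + 1)
                  if 0 < (n - theta) / a < 1})
    s = Fr(0)
    for l, r in zip(pts, pts[1:]):
        n = floor(a * (l + r) / 2 + theta)
        # integral of x (a x + theta - n) is a x^3/3 + (theta - n) x^2/2
        s += a * (r ** 3 - l ** 3) / 3 + (theta - n) * (r ** 2 - l ** 2) / 2
    return s

a, theta = 7, Fr(1, 3)
C = (s_exact(a, theta) - Fr(1, 4)) / (Fr(1, 3) - Fr(1, 4))
assert C == (1 - 6 * theta + 6 * theta ** 2) / a
```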

\ansno 22. Let $[u,v)$ denote the set $\leftset x \relv u≤x<v\rightset$. We have
$s(x)<x$ in the disjoint intervals
$$\left[{1-\theta\over a},{1-\theta\over a-1}\right),\;
\left[{2-\theta\over a},{2-\theta\over a-1}\right),\;\ldotss,\;
\left[{a-\theta\over a},1\right),$$
which have total length
$$1+\sum↓{0<j≤a-1}\left(j-\theta\over a-1\right)-\sum↓{0<j≤a}\left(j-\theta\over
a\right)=1+{a\over2}-\theta-{a+1\over2}+\theta={1\over2}.$$

\ansno 23. We have $s(s(x))<s(x)<x$ when $x$ is in $\left[{k-\theta\over a},
{k-\theta\over a-1}\right)$ and $ax+\theta-k$ is in $\left[{j-\theta\over a},
{j-\theta\over a-1}\right)$, for $0<j≤k<a$; or when $x$ is in $\left[{a-\theta
\over a},1\right)$ and $ax+\theta-a$ is either in $\left[{j-\theta\over a},{j-\theta
\over a-1}\right)$ for $0<j≤\lfloor a\theta\rfloor$ or in
$\vcenter{\hbox{\:@\char2}}{\lfloor a\theta\rfloor+1-\theta\over a},\theta
\vcenter{\hbox{\:@\char1}}$.
The desired probability is
$$\twoline{\sum↓{0<j≤k<a}{j-\theta\over a↑2(a-1)}+\sum↓{0<j≤\lfloor a\theta\rfloor}
{j-\theta\over a↑2(a-1)}+{1\over a↑2}\max(0,\{a\theta\}+\theta-1)}{0pt}{
={1\over6}+{1\over6a}-{\theta\over2a}+{1\over a↑2}\left({\lfloor a\theta\rfloor
(\lfloor a\theta\rfloor+1-2\theta)\over2(a-1)}+\max(0,\{a\theta\}+\theta-1)\right),
}$$ which is ${1\over6}+(1-3\theta+3\theta↑2)/6a+O(1/a↑2)$ for large $a$. Note
that $1-3\theta+3\theta↑2≥{1\over4}$, so $\theta$ can't be chosen to make this
probability come out right.
%folio 687 galley 1 (C) Addison-Wesley 1978	*
\ansno 24. Proceed as in the previous exercise; the sum of the interval
lengths is
$$\sum↓{0<j↓1≤\cdots≤j↓{t-1}<a}\,{j↓1\over a↑{t-1}(a-1)}=
{1\over a↑{t-1}(a-1)}{a+t-2\choose t}.$$
To compute the average length, let $p↓k$ be the probability of a run of length
$≥k$; the average is
$$\sum↓{k≥1}p↓k=\sum↓{k≥1}{a+k-2\choose k}{1\over a↑{k-1}(a-1)}=\left(a\over
a-1\right)↑a-{a\over a-1}.$$
The value for a truly random sequence would be $e - 1$;
and our value is $e - 1 + (e/2 -\penalty1000 1)/a + O(1/a↑2)$. {\sl Note:} The
same result holds for an ascending run, since $U↓n > U↓{n+1}$
if and only if $1 - U↓n < 1 - U↓{n+1}$. This would lead us to
suspect that runs in linear congruential sequences might
be slightly longer than normal, so the run test should be
applied to such generators.
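
A quick numerical check of the closed form (Python):

```python
from math import comb, e, isclose

a = 5
closed = (a / (a - 1)) ** a - a / (a - 1)
# partial sums of  sum_{k>=1} C(a+k-2,k) / (a^{k-1} (a-1))  converge fast
partial = sum(comb(a + k - 2, k) / (a ** (k - 1) * (a - 1))
              for k in range(1, 120))
assert isclose(partial, closed, rel_tol=1e-12)
assert closed > e - 1      # slightly longer runs than in the random case
```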

\ansno 25. $x$ must be in the interval
$[(k + α↑\prime - \theta)/a, (k + β↑\prime - \theta)/a)$ for some $k$,
and also in the interval $[α, β)$. Let $k↓0 = \lceil aα + \theta - β↑\prime \rceil$,
$k↓1 = \lceil aβ + \theta - β↑\prime \rceil$.  
With due regard to boundary
conditions, we get the probability
$$(k↓1 - k↓0)(β↑\prime - α↑\prime )/a + \max
\biglp 0,β - (k↓1 + α↑\prime - \theta)/a\bigrp - \max\biglp
0, α - (k↓0 + α↑\prime - \theta)/a\bigrp.$$
This is $(β - α)(β↑\prime - α↑\prime ) +
\epsilon $, where $|\epsilon | < 2(β↑\prime - α↑\prime )/a$.

\ansno 26. See Fig$.$ A--1; the orderings $U↓1
< U↓3 < U↓2$ and $U↓2 < U↓3 < U↓1$ are impossible; the other
four each have probability ${1\over 4}$.

\topinsert{\vskip 28mm
\hbox to 48mm{\caption Fig.\ A--1. Permutation regions for
Fibonacci generator.}
\vskip 13mm
\hbox to 48mm{\caption Fig.\ A--2. Run-length regions for Fibonacci
generator.}}

\ansno 27. $U↓n = \{F↓{n-1}U↓0 + F↓nU↓1\}$. We need $F↓{k-1}U↓0
+ F↓kU↓1 < 1$ and $F↓kU↓0 + F↓{k+1}U↓1 > 1$. The half-unit-square
in which $U↓0 > U↓1$ is broken up as shown in Fig$.$ A--2, with
various values of $k$ indicated. The probability for a run of
length $k$ is ${1\over 2}$ if $k = 1$; $1/F↓{k-1}F↓{k+1} - 1/F↓kF↓{k+2}$,
if $k > 1$. The corresponding probabilities for a random sequence
are $2k/(k+1)!-2(k+1)/(k+2)!$; the following table compares the
first few values.

\ctrline{$
\vcenter{\halign{#⊗\quad\ctr{$#$}⊗\quad\ctr{$#$}⊗\quad\ctr{$#$}⊗\quad
\ctr{$#$}⊗\quad\ctr{$#$}\cr
\hfill $k$: \hfill⊗1⊗2⊗3⊗4⊗5\cr
Probability in Fibonacci case:\hfill⊗1\over 2⊗1\over
6⊗7\over 30⊗ 7\over 120⊗ 41\over 1560\cr
Probability in random case:\hfill⊗1\over 3⊗5\over 12⊗11\over
60⊗19\over 360⊗29\over 2520\cr}}$}
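
These probabilities telescope to 1, and the mean run length comes out to 2, the reciprocal of the probability ${1\over2}$ that a given position starts a run; a Python check (names ours):

```python
from fractions import Fraction as Fr

def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a                       # fib(1) = fib(2) = 1

def p_run(k):
    # probability of a run of length k in the Fibonacci case
    if k == 1:
        return Fr(1, 2)
    return Fr(1, fib(k - 1) * fib(k + 1)) - Fr(1, fib(k) * fib(k + 2))

K = 40
# the probabilities telescope: sum_{k<=K} = 1 - 1/(F_K F_{K+2})
assert sum(p_run(k) for k in range(1, K + 1)) == 1 - Fr(1, fib(K) * fib(K + 2))
mean = sum(k * p_run(k) for k in range(1, K + 1))
assert abs(float(mean) - 2) < 1e-12
```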

\ansno 28. Fig$.$ A--3 shows the various regions in the
general case. The ``213'' region means $U↓2 < U↓1 < U↓3$, if
$U↓1$ and $U↓2$ are chosen at random; the ``321'' region means
$U↓3 < U↓2 < U↓1$, etc. The probabilities for 123 and 321 are
${1\over 4} - α/2 + α↑2/2$; the probabilities for all other
cases are ${1\over 8} + α/4 - α↑2/4$. To have all equal to
${1\over 6}$, we must have $1 - 6α + 6α↑2 = 0$.\xskip
[This exercise establishes a theorem due to J. N. Franklin,
{\sl Math$.$ Comp$.$ \bf 17} (1963), 28--59 (Theorem 13); other results
of Franklin's paper are related to exercises 22 and 23.]

\topinsert{\vskip 104mm
\ctrline{\caption Fig.\ A--3. Permutation regions for a generator
with potency 2; $α = (a - 1)c/m$.}}

\ansbegin{3.3.4}

\ansno 1. $\nu ↓1$ is always $m$ and
$\mu ↓1 = 2$, for generators of maximum period.

\ansno 2. Let $V$ be the matrix whose rows are
$V↓1, \ldotss , V↓t$. To minimize $Y \cdot Y$, subject to the
condition that $Y ≠ (0, \ldotss , 0)$ and $VY$ is an integer
column vector $X$, is equivalent to minimizing $(V↑{-1}X) \cdot
(V↑{-1}X)$, subject to the condition that $X$ is a nonzero integer
column vector. The columns of $V↑{-1}$ are $U↓1$, $\ldotss$, $U↓t$.

\ansno 3. $a↑2 ≡ 2a - 1$ and $a↑3 ≡ 3a - 2 \modulo
m$. By considering all short solutions of (15), we find that
$\nu↓3↑2 = 6$ and $\nu↓4↑2 = 4$, for the respective
vectors $(1, -2, 1)$ and $(1, -1, -1, 1)$, except in the following
cases: $m = 2↑eq$, $q$ odd, $e ≥ 3$, $a ≡ 2↑{e-1} \modulo {2↑e}$,
$a ≡ 1 \modulo q$, $\nu↓3↑2 = \nu↓4↑2 = 2$; $m =
3↑eq$, $\gcd(3, q) = 1$, $e ≥ 2$, $a ≡ 1 \pm 3↑{e-1} \modulo {3↑e}$,
$a ≡ 1 \modulo q$, $\nu↓4↑2 = 2$; $m = 9$, $a = 4$ or 7,
$\nu↓2↑2 = \nu↓3↑2 = 5$.
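
These minima are small enough to find by brute force; for instance $m = 2↑{10}$, $a = 33$ has $(a-1)↑2 ≡ 0$, so $a↑2 ≡ 2a-1$ and $a↑3 ≡ 3a-2$, and it avoids the exceptional cases (a Python sketch, names ours):

```python
from itertools import product

def nu_sq(m, a, t, box=3):
    # min of y.y over nonzero integer vectors y with
    # y_1 + a y_2 + ... + a^(t-1) y_t = 0 (mod m) and all |y_i| <= box;
    # box 3 suffices here, since any vector with some |y_i| > 3 has y.y > 9
    best = None
    for y in product(range(-box, box + 1), repeat=t):
        if any(y) and sum(yi * pow(a, i, m) for i, yi in enumerate(y)) % m == 0:
            norm = sum(yi * yi for yi in y)
            best = norm if best is None else min(best, norm)
    return best

assert nu_sq(1024, 33, 3) == 6      # attained by (1, -2, 1)
assert nu_sq(1024, 33, 4) == 4      # attained by (1, -1, -1, 1)
```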

\ansno 4. (a)\9 The unique choice for $(x↓1, x↓2)$
is ${1\over m}(y↓1u↓{22} - y↓2u↓{21}, -y↓1u↓{12} + y↓2u↓{11})$,
and this is $≡ {1\over m}(y↓1u↓{22} + y↓2au↓{22}, -y↓1u↓{12}
- y↓2au↓{12}) ≡ (0, 0) \modulo 1$; i.e., $x↓1$ and $x↓2$ are
integers.\xskip (b) When $(x↓1, x↓2) ≠ (0, 0)$, we have $(x↓1u↓{11}
+ x↓2u↓{21})↑2 + (x↓1u↓{12} + x↓2u↓{22})↑2 = x↓1↑2(u
↓{11}↑2 + u↓{12}↑2) + x↓2↑2(u↓{21}↑2 + u↓{22}↑2)
+ 2x↓1x↓2(u↓{11}u↓{21} + u↓{12}u↓{22}) ≥ (x↓1↑2
+ x↓2↑2 - |x↓1x↓2|)(u↓{11}↑2 + u↓{12}↑2)
≥ u↓{11}↑2+u↓{12}↑2$.\xskip [Note that this is a stronger
result than Lemma A\null, which tells us only that $x↓1↑2
≤ (u↓{11}↑2 + u↓{12}↑2)(u↓{21}↑2 + u↓{22}↑2)
/m↑2$, $x↓2↑2 ≤ (u↓{11}↑2 + u↓{12}↑2)↑2/m↑2$,
and the latter can be $≥1$. The idea is essentially Gauss's notion
of a reduced binary quadratic form, {\sl Disq.\ Arith.\ } (Leipzig,
1801), $\section 171$.]

\ansno 5. Conditions (30) remain invariant; hence
$h$ cannot be zero in step S2, when $a$ is relatively prime
to $m$. Since $h$ always decreases in that step, S2 eventually
terminates with $u↑2 + v↑2 ≥ s$. Note that $pp↑\prime ≤ 0$ throughout
the calculation.

The hinted inequality surely holds the first time step S2 is performed.
The integer $q↑\prime$ that minimizes $(h↑\prime - qh)↑2
+ (p↑\prime - q↑\prime p)↑2$ is $q↑\prime = \hbox{round}\biglp(h↑\prime
h + p↑\prime p)/(h↑2 + p↑2)\bigrp$, by (24).
If $(h↑\prime - q↑\prime h)↑2 + (p↑\prime - q↑\prime p)↑2 <
h↑2 + p↑2$ we must have $q↑\prime ≠ 0$, $q↑\prime ≠ -1$, hence
$(p↑\prime - q↑\prime p)↑2 ≥ p↑2$, hence $(h↑\prime - q↑\prime
h)↑2 < h↑2$, i.e., $|h↑\prime - q↑\prime h| < h$, i.e., $q↑\prime$ is
$q$ or $q + 1$. We have $hu + pv ≥ h(h↑\prime - q↑\prime h)
+ p(p↑\prime - q↑\prime p) ≥ -{1\over 2}(h↑2 + p↑2)$, so if
$u↑2 + v↑2 < s$ the next iteration of step S2 will preserve
the assumption in the hint. If $u↑2 + v↑2 ≥ s > (u - h)↑2 +
(v - p)↑2$, we have $2|h(u - h) + p(v - p)| = 2\biglp h(h -
u) + p(p - v)\bigrp = (u - h)↑2 + (v - p)↑2 + h↑2
+ p↑2 - (u↑2 + v↑2) ≤ (u - h)↑2 + (v - p)↑2 ≤ h↑2 + p↑2$, hence
$(u - h)↑2 + (v - p)↑2$ is minimal by exercise 4. Finally if
both $u↑2 + v↑2$ and $(u - h)↑2 + (v - p)↑2$ are $≥s$, let $u↑\prime
= h↑\prime - q↑\prime h$, $v↑\prime = p↑\prime - q↑\prime p$;
then $2|hu↑\prime + pv↑\prime | ≤ h↑2 + p↑2 ≤ {u↑\prime}↑2 +
{v↑\prime}↑2$, and $h↑2 + p↑2$ is minimal by exercise 4.

\ansno 6. If $u↑2 + v↑2 ≥ s > (u - h)↑2 + (v -
p)↑2$ in the previous answer, we have $(v - p)↑2 > v↑2$, hence
$(u - h)↑2 < u↑2$; and if $q = a↓j$, so that $h↑\prime = a↓jh
+ u$, we must have $a↓{j+1} = 1$. It follows that $\nu
↓2↑2 = \min↓{0≤j<t}(m↓j↑2 + p↑{2}↓{j-1})$, in
the notation of exercise 3.3.3--16.

Now $m↓0 = m↓jp↓j + m↓{j+1}p↓{j-1} = a↓jm↓jp↓{j-1} + m↓jp↓{j-2}
+ m↓{j+1}p↓{j-1} < (a↓j + 1 + 1/a↓j)m↓jp↓{j-1} ≤ (A + 1 + 1/A)m↓jp↓{j-1}$,
and $m↓j↑2 + p↑{2}↓{j-1} ≥ 2m↓jp↓{j-1}$, hence the result.

\ansno 7. We shall prove, using condition (19),
that $U↓j \cdot U↓k = 0$ for all $k ≠ j$ iff $V↓j \cdot V↓k
= 0$ for all $k ≠ j$. Assume that $U↓j \cdot U↓k = 0$ for all
$k ≠ j$, and let $U↓j = α↓1V↓1 + \cdots + α↓tV↓t$. Then $U↓j
\cdot U↓k = α↓k$ for all $k$, hence $U↓j = α↓jV↓j$, and $V↓j
\cdot V↓k = α↑{-1}↓{j}(U↓j \cdot V↓k) = 0$ for all $k ≠ j$.
A symmetric argument proves the converse.

\ansno 8. Clearly $\nu↓{t+1} ≤ \nu ↓t$ (a fact used
implicitly in Algorithm S\null, since $s$ is not changed when $t$
increases). For $t = 2$ this is equivalent to $(m\mu ↓2/π)↑{1/2}
≥ ({3\over 4}m\mu ↓3/π)↑{1/3}$, i.e., $\mu ↓3 ≤ {4\over 3} \sqrt{m/π}\,
\mu ↑{3/2}↓{2}$. This reduces to ${4\over 3}10↑{-4}/
\sqrt{π}$ with the given parameters, but for large
$m$ and fixed $\mu ↓2$ the bound (39) is better.
%folio 690 galley 2 (C) Addison-Wesley 1978	*
\ansno 9. Let $f(y↓1, \ldotss , y↓t) = \theta $; then
$\gcd(y↓1, \ldotss , y↓t) = 1$, so there is an integer matrix
$W$ of determinant 1 having $(w↓1, \ldotss , w↓t)$ as its first
row.\xskip (Prove the latter fact by induction on the magnitude of
the smallest nonzero entry in the row.)\xskip Now if $X = (x↓1, \ldotss
, x↓t)$ is a row vector, we have $XW = X↑\prime$ iff $X = X↑\prime
W↑{-1}$, and $W↑{-1}$ is an integer matrix of determinant 1,
hence the form $g$ defined by $WU$ satisfies $g(x↓1, \ldotss
, x↓t) = f(x↓1↑\prime , \ldotss , x↓t↑\prime )$;
furthermore $g(1, 0, \ldotss , 0) = \theta $.

Without loss of generality, assume that $f = g$.
If now $S$ is any orthogonal matrix, the matrix $US$ defines
the same form as $U$, since $(XUS)(XUS)↑T = (XU)(XU)↑T$. Choosing
$S$ so that its first column is a multiple of $V↓1$ and its
other columns are any suitable vectors, we have
$$US=\left(\vcenter{\halign{\ctr{$#$}⊗\ctr{$#$}\cr
α↓1⊗\quad α↓2\quad \ldots\quad α↓t\cr
0\cr
\vcenter{\baselineskip4pt\lineskip0pt\hbox{.}\hbox{.}\hbox{.}}⊗U↑\prime\cr
\noalign{\vskip 2pt}
0\cr}}\right)$$
for some $α↓1$, $α↓2$, $\ldotss$, $α↓t$ and some $(t - 1)
\times (t - 1)$ matrix $U↑\prime $. Hence $f(x↓1, \ldotss , x↓t)
= (α↓1x↓1 + \cdots + α↓tx↓t)↑2 + h(x↓2, \ldotss , x↓t)$. It follows
that $α↓1 = \sqrt{\theta }$ \raise 8.5pt\hbox{\:@\char'2}\! % 12pt left bracket
in fact, $α ↓j = (U↓1 \cdot U↓j)/ \sqrt{\theta }$
for $1 ≤ j ≤ t$\raise 8.5pt\hbox{\:@\char'3}, %12pt right bracket
and that $h$ is a positive definite quadratic
form defined by $U↑\prime $, where $\det U↑\prime = (\det U)/
\sqrt{\theta }$. By induction on $t$, there are integers
$(x↓2, \ldotss , x↓t)$ with $h(x↓2, \ldotss , x↓t) ≤ ({4\over
3})↑{(t-2)/2} |\det U|↑{2/(t-1)}/\theta ↑{1/(t-1)}$, and for
these integer values we can choose $x↓1$ so that $|x↓1 + (α↓2x↓2
+ \cdots + α↓tx↓t)/α↓1| ≤ {1\over 2}$, i.e., $(α↓1x↓1 + \cdots
+ α↓tx↓t)↑2 ≤ {1\over 4}\theta $. Hence $\theta ≤ f(x↓1, \ldotss
, x↓t) ≤ {1\over 4}\theta + ({4\over 3})↑{(t-2)/2} |\det U|↑{2/(t-1)}/\theta
↑{1/(t-1)}$ and the desired inequality follows immediately.

[For $t = 2$ the result is best possible. For
general $t$, Hermite's theorem implies that 
$\mu ↓t ≤ π↑{t/2}(4/3)↑{t(t-1)/4}/(t/2)!\,$.
A fundamental theorem due to Minkowski (``Every $t$-dimensional
convex set symmetric about the origin with volume $≥2↑t$ contains
a nonzero integer point'') gives $\mu ↓t ≤ 2↑t$; this is stronger
than Hermite's theorem for $t ≥ 9$. And even stronger results
are known, cf.\ (41).]

\ansno 10. Since $y↓1$ and $y↓2$ are relatively prime, we can
solve $u↓1y↓2 - u↓2y↓1 = m$; furthermore $(u↓1 + qy↓1)y↓2 -
(u↓2 + qy↓2)y↓1 = m$ for all $q$, so we can ensure that $2|u↓1y↓1
+ u↓2y↓2| ≤ y↓1↑2 + y↓2↑2$ by choosing an appropriate
integer $q$. Now $y↓2(u↓1 + au↓2) ≡ y↓2u↓1 - y↓1u↓2 ≡ 0 \modulo
m$, and $y↓2$ must be relatively prime to $m$, hence $u↓1
+ au↓2 ≡ 0$. Finally let $|u↓1y↓1 + u↓2y↓2| = αm$, $u↓1↑2
+ u↓2↑2 = βm$, $y↓1↑2 + y↓2↑2 = \gamma m$;
we have $0 ≤ α ≤ {1\over 2}\gamma $, and it remains to be shown
that $α ≤ {1\over 2}β$ and $β\gamma ≥ 1$. The identity $(u↓1y↓2
- u↓2y↓1)↑2 + (u↓1y↓1 + u↓2y↓2)↑2 = (u↓1↑2 + u
↓2↑2)(y↓1↑2 + y↓2↑2)$ implies that $1 + α↑2 =
β\gamma $. If $α > {1\over 2}β$, we have $2α\gamma > 1 + α↑2$,
i.e., $\gamma - \sqrt{\gamma ↑2 - 1} < α ≤
{1\over 2}\gamma $. But ${1\over 2}\gamma <
\sqrt{\gamma ↑2 - 1}$ implies that $\gamma ↑2 > {4\over 3}$,
a contradiction.

\ansno 11. Since $a$ is odd, $y↓1 + y↓2$ must be even. To avoid
solutions with $y↓1$ and $y↓2$ both even, let $y↓1 = x↓1 + x↓2$,
$y↓2 = x↓1 - x↓2$, and solve $x↓1↑2 + x↓2↑2 = m/
\sqrt{3} - \epsilon $, with $(x↓1, x↓2)$ relatively
prime and $x↓1$ even; the corresponding multiplier $a$ will
be the solution to $(x↓2 - x↓1)a ≡ x↓2 + x↓1 \modulo {2↑e}$.
It is not difficult to prove that $a ≡ 1 \modulo {2↑{k+1}}$
iff $x↓1 ≡ 0 \modulo {2↑k}$, so we get the best potency when
$x↓1 \mod 4 = 2$. The problem reduces to finding relatively prime
solutions to $x↓1↑2 + x↓2↑2 = N$ where $N$ is
a large integer of the form $4k + 1$. By factoring $N$ over
the Gaussian integers, we can see that solutions exist if and
only if each prime factor of $N$ (over the usual integers) has
the form $4k + 1$.

According to a famous theorem of Fermat, every
prime $p$ of the form $4k + 1$ can be written $p = u↑2 + v↑2
= (u + iv)(u - iv)$, $v$ even, in a unique way except for the
signs of $u$ and $v$. The numbers $u$ and $v$ can be calculated
efficiently by solving $x↑2 ≡ -1 \modulo p$, then calculating
$u + iv = \gcd(x + i, p)$ by Euclid's algorithm over the Gaussian
integers.\xskip$\biglp$We can take $x = n↑{(p-1)/4} \mod p$ for almost
half of all integers $n$. This application of a Euclidean algorithm
is essentially the same as finding the least nonzero $u↑2 +
v↑2$ such that $u \pm xv ≡ 0 \modulo p$.$\bigrp$ If the prime factorization
of $N$ is $p↑{e↓1}↓{1} \ldotss p↑{e↓r}↓{r} = (u↓1 + iv↓1)↑{e↓1}
(u↓1 - iv↓1)↑{e↓1} \ldotss (u↓r + iv↓r)↑{e↓r}(u↓r
- iv↓r)↑{e↓r}$, we get $2↑{r-1}$ distinct solutions to
$x↓1↑2 + x↓2↑2 = N$, $\gcd(x↓1, x↓2) = 1$, $x↓1$
even, by letting $|x↓2| + i|x↓1| = (u↓1 + iv↓1)↑{e↓1}(u↓2
\pm iv↓2)↑{e↓2}\ldotss(u↓r \pm iv↓r)↑{e↓r}$; and all
such solutions are obtained in this way.
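The two-squares computation can be carried out with ordinary (rather than Gaussian) Euclidean steps: once $x↑2 ≡ -1 \modulo p$ is known, the first two remainders below $\sqrt p$ in the Euclidean algorithm on $(p, x)$ give $u$ and $v$. A minimal Python sketch of this equivalent form (function names are illustrative):

```python
def sqrt_minus_one(p):
    # find x with x^2 == -1 (mod p); x = n^((p-1)/4) mod p works
    # for almost half of all n, when p is a prime of the form 4k+1
    for n in range(2, p):
        x = pow(n, (p - 1) // 4, p)
        if x * x % p == p - 1:
            return min(x, p - x)
    raise ValueError("p must be a prime of the form 4k+1")

def two_squares(p):
    # Euclidean steps on (p, x): the first remainders below sqrt(p)
    # are u, v with u^2 + v^2 = p; this is essentially the same
    # computation as gcd(x + i, p) over the Gaussian integers
    a, b = p, sqrt_minus_one(p)
    while b * b > p:
        a, b = b, a % b
    return b, a % b
```

For example, `two_squares(13)` yields the pair $3$, $2$ with $3↑2 + 2↑2 = 13$.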

{\sl Note:} When $m = 10↑e$, a similar procedure
can be used, but it is five times as much work since we must
keep trying until finding a solution with $x↓1 ≡ 0 \modulo
{10}$. For example, when $m = 10↑{10}$ we have $\lfloor m/
\sqrt{3}\rfloor = 5773502691$, and $5773502689 = 53 \cdot
108934013 = (7 + 2i)(7 - 2i)(2203 + 10202i)(2203 - 10202i)$.
Of the two solutions $|x↓2| + i|x↓1| = (7 + 2i)(2203 + 10202i)$
or $(7 + 2i)(2203 - 10202i)$, the former gives $|x↓1| = 75820$,
$|x↓2| = 4983$ (which is usable) and the latter gives $|x↓1| =
67008$ (no good). Line 9 of Table 1 was obtained by taking
$x↓1 = 75820$, $x↓2 = -4983$.
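The arithmetic of this example can be checked with exact Gaussian-integer multiplication; a short Python sketch (the helper name `gmul` is illustrative):

```python
def gmul(a, b):
    # product of Gaussian integers represented as (real, imag) pairs
    return (a[0]*b[0] - a[1]*b[1], a[0]*b[1] + a[1]*b[0])

N = 5773502689                       # = 53 * 108934013
assert 53 * 108934013 == N

sol1 = gmul((7, 2), (2203, 10202))   # one choice of signs
sol2 = gmul((7, 2), (2203, -10202))  # the other choice

# both give |x2| + i|x1| with x1^2 + x2^2 = N
for re, im in (sol1, sol2):
    assert re * re + im * im == N

# x1 (the even component) must be a multiple of 10 here;
# only |x1| = 75820 qualifies, |x1| = 67008 does not
assert abs(sol1[1]) == 75820 and abs(sol1[1]) % 10 == 0
assert abs(sol2[1]) == 67008 and abs(sol2[1]) % 10 != 0
```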

Line 20 of the table was obtained as follows:
$\lfloor 2↑{35}/\sqrt{3}\rfloor = 19837604196$; 19837604193
is divisible by 3 so it is ineligible; 19837604189 is divisible
by 19, and 19837604185 by 7, and 19837604181 by 3; 19837604177
is prime and equals $131884↑2 + 49439↑2$. The corresponding multiplier
is 1175245817; a better one could be found if we continued searching.
The multiplier on line 24 was obtained as the best of the first sixteen
multipliers found by this procedure when $m=2↑{32}$.
%folio 693 galley 3 (C) Addison-Wesley 1978	*
\ansno 12. ${U↓j}↑\prime \cdot {U↓j}↑\prime=U↓j \cdot U↓j+2\sum
↓{i≠j}q↓iU↓i \cdot U↓j + \sum ↓{i≠j}\sum ↓{k≠j}q↓iq↓kU↓i \cdot
U↓k$. The partial derivative with respect to $q↓k$ is twice
the left-hand side of (26). If the minimum can be achieved,
these partial derivatives must all vanish.

\ansno 13. $u↓{11} = 1$, $u↓{21} =$ irrational, $u↓{12} = u↓{22} = 0$.

\ansno 14. After three Euclidean steps we find $\nu↓2↑2
= 5↑2 + 5↑2$, then S4 produces
\def\\#1{\left(\vcenter{\halign{\hfill$##$\quad⊗\hfill$##$\quad⊗\hfill$##$\cr#1}}
\right)}
$$U = \\{-5⊗5⊗0\cr-18⊗-2⊗0\cr1⊗-2⊗1\cr},\qquad
V=\\{-2⊗18⊗38\cr-5⊗-5⊗-5\cr0⊗0⊗100\cr}.$$
\def\¬{\char'403} % asterisk centered on axis
Transformations $(j, q↓1, q↓2, q↓3) = (1, \¬, 0,
2)$, $(2, -4, \¬, 1)$, $(3, 0, 0, \¬)$, $(1, \¬, 0, 0)$ result in
$$U = \\{-3⊗1⊗2\cr-5⊗-8⊗-7\cr1⊗-2⊗1\cr},\qquad V=\\{-22⊗-2⊗18\cr
-5⊗-5⊗-5\cr9⊗-31⊗29\cr},\qquad Z = (0\quad0\quad1).$$
Thus $\nu ↓3 = \sqrt{6}$, as
we already knew from exercise 3.
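These matrices can be checked mechanically: the duality relation $U↓i \cdot V↓j = m\,\delta↓{ij}$ must hold throughout the algorithm, and $\nu↓3↑2$ is the smallest squared row length of $U$. A small Python check, assuming $m = 100$ as the entries of $V$ suggest:

```python
# final matrices of the example (m = 100 assumed)
U = [[-3, 1, 2], [-5, -8, -7], [1, -2, 1]]
V = [[-22, -2, 18], [-5, -5, -5], [9, -31, 29]]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# duality: U_i . V_j = m when i = j, and 0 otherwise
for i in range(3):
    for j in range(3):
        assert dot(U[i], V[j]) == (100 if i == j else 0)

# nu_3 is the length of the shortest row of U, here sqrt(6)
assert min(dot(row, row) for row in U) == 6
```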

\ansno 15. The largest achievable $q$ in (11), minus the smallest
achievable, plus 1, is $|u↓1| + \cdots + |u↓t| - \delta $, where
$\delta = 1$ if $u↓iu↓j <
0$ for some $i$ and $j$, otherwise $\delta = 0$. For example
if $t = 5$, $u↓1 > 0$, $u↓2 > 0$, $u↓3 > 0$, $u↓4 = 0$, and $u↓5 < 0$, the largest
achievable value is $q = u↓1 + u↓2 + u↓3 - 1$ and the smallest
is $q = u↓5 + 1 = -|u↓5| + 1$.

[Note that the number of hyperplanes is unchanged
when $c$ varies, hence the same answer applies to the problem
of covering $L$ instead of $L↓0$. However, the stated formula
is {\sl not} always exact for covering $L↓0$, since the hyperplanes
that intersect the unit hypercube may not all contain points
of $L↓0$. In the example above, we can never achieve the value $q = u↓1
+ u↓2 + u↓3 - 1$ in $L↓0$ if $u↓1 + u↓2 + u↓3 > m$; it is achievable iff
there is a solution to $m - u↓1 - u↓2 - u↓3 = x↓1u↓1 + x↓2u↓2
+ x↓3u↓3 + x↓4|u↓5|$ in nonnegative integers $(x↓1, x↓2, x↓3,
x↓4)$. It may be true that the stated limits are always achievable
when $|u↓1| + \cdots +|u↓t|$ is minimal, but this does not appear
to be obvious.]

\ansno 16. It suffices to determine all solutions to (15) having
minimum $|u↓1| + \cdots +|u↓t|$, subtracting 1 if any one of
these solutions has components of opposite sign.

Instead of positive definite quadratic forms, we
work with $f(x↓1, \ldotss , x↓t) = |x↓1U↓1 + \cdots + x↓tU↓t|$,
defining $|Y| = |y↓1| + \cdots + |y↓t|$. Inequality (21) can
be replaced by $|x↓k| ≤ (\max↓{1≤j≤t}|v↓{kj}|)\,f(y↓1,
\ldotss , y↓t)$.

Thus a workable algorithm can be obtained as follows.
Replace steps S1 through S3 by: ``Set $U ← (m)$, $V ← (1)$, $r ←
1$, $s ← m$, $t ← 1$.'' (Here $U$ and $V$ are $1 \times 1$ matrices;
thus the two-dimensional case will be handled by the general
method. A special procedure for $t = 2$ could, of course, be
devised.) In steps S4 and S8, set $s ← \min(s, |U↓k|)$. In
step S8, set $z↓k ← \lfloor \max↓{1≤j≤t}|v↓{kj}|\,s/m\rfloor$.
In step S10, set $s ← \min(s, |Y| - \delta )$; and in step
S11, output $s = N↓t$. Otherwise leave the algorithm as it stands,
since it already produces suitably short vectors.\xskip [{\sl Math.\
Comp.\ \bf 29} (1975), 827--833.]

\ansno 17. When $k > t$ in S10, and if $Y \cdot Y ≤ s$, output
$Y$ and  $-Y$; furthermore if $Y \cdot
Y < s$, take back the previous output of vectors for this $t$.\xskip
[In the author's experience preparing Table 1, there was
exactly one vector (and its negative) output
for each $\nu ↓t$, except when $y↓1 = 0$ or $y↓t = 0$.]

\ansno 18. (a)\9 Let $x = m$, $y = (1 - m)/3$, $v↓{ij} = y +
x\delta ↓{ij}$, $u↓{ij} = -y + \delta ↓{ij}$. Then $V↓j \cdot
V↓k = {1\over 3}(1 - m↑2)$ for $j ≠ k$, $V↓k \cdot V↓k = {2\over
3}(m↑2 + {1\over 2})$, $U↓j \cdot U↓j = {1\over 3}(m↑2 + 2)$, $z↓k
\approx \sqrt{2\over 9}\,m$.\xskip $\biglp$This example satisfies
(28) with $a = 1$ and works for all $m ≡ 1 \modulo 3$.$\bigrp$

(b)\9 Interchange the
r\A oles of $U$ and $V$ in step S5. Also set $s ← \min(s, U↓i
\cdot U↓i)$ for all $U↓i$ that change. For example, when $m
= 64$ this transformation with $j = 1$, applied to the matrices
of (a), reduces
$$\chop to 10pt{V = \\{43⊗-21⊗-21\cr-21⊗43⊗-21\cr-21⊗-21⊗43\cr},\quad
U=\\{22⊗21⊗21\cr21⊗22⊗21\cr21⊗21⊗22\cr}}$$
to
$$V=\\{1⊗1⊗1\cr-21⊗43⊗-21\cr-21⊗-21⊗43\cr},\quad
U=\\{22⊗21⊗21\cr-1⊗1⊗0\cr-1⊗0⊗1\cr}.$$
[Since the transformation can increase the
length of $V↓j$, an algorithm that incorporates both transformations
must be careful to avoid infinite looping.]

\ansno 19. No, since a product of non-identity matrices with
all off-diagonal elements nonnegative and all diagonal elements
1 cannot be the identity.

%This paragraph break introduced to fill up a sparse page
[However, looping would be possible
if a subsequent transformation with $q = -1$ were performed
when $-2V↓i \cdot V↓j = V↓j \cdot V↓j$; the rounding rule must
be asymmetric with respect to sign if non-shortening transformations
are allowed.]

\ansno 20. Use the ordinary spectral test for $a$ and $m = 2↑{e-2}$;
cf.\ exercise 3.2.1.2--9.\xskip [On intuitive grounds, the same answer
should apply also when $a \mod 8 = 3$.]

\ansno 21. $X↓{4n+4} ≡ X↓{4n} \modulo 4$, so it is appropriate
to let $V↓1 = (4, 4a↑2, 4a↑3)/m$, $V↓2 = (0, 1, 0)$, $V↓3 = (0,
0, 1)$ define the corresponding lattice $L↓0$.

\ansno 24. Let $m = p$; an analysis paralleling the text can
be given. For example, when $t = 4$ we have $X↓{n+3} = \biglp
(a↑2 + b)X↓{n+1} + abX↓n\bigrp \mod m$, and we want to minimize
$u↓1↑2 + u↓2↑2 + u↓3↑2 + u↓4↑2 ≠
0$ such that $u↓1 + bu↓3 + abu↓4≡ u↓2 + au↓3 + (a↑2 + b)u↓4
≡ 0 \modulo m$.

Replace steps S1 through S3 by the operations of
setting
$$U ← \left({m\atop 0}\quad {0\atop m}\right),\qquad V ← \left({1\quad
0\atop 0\quad 1}\right),\qquad R ← \left({1\quad 0\atop 0\quad 1}\right),\qquad
s ← m↑2,\qquad t ← 2,$$
and outputting $\nu ↓2 = m$. Replace step S4 by

\algstep S4$↑\prime$. [Advance $t$.]
If $t = T$, the algorithm terminates. Otherwise set $t ← t +
1$ and $R ← R\left({0\atop 1}\,{b\atop a}\right)\mod m$. Set $U↓t$
to the new row $(-r↓{12}, -r↓{22}, 0, \ldotss , 0, 1)$ of $t$
elements, and set $u↓{it} ← 0$ for $1 ≤ i < t$. Set $V↓t$ to
the new row $(0, \ldotss , 0, m)$. For $1 ≤ i < t$, set $q ←
\hbox{round}\biglp (v↓{i1}r↓{12} + v↓{i2}r↓{22})/m\bigrp$, $v↓{it}
← v↓{i1}r↓{12} + v↓{i2}r↓{22} - qm$, and $U↓t ← U↓t + qU↓i$.
Finally set $s ← \min(s, U↓t \cdot U↓t)$, $k ← t$, $j ← 1$.

\yskip [A similar generalization applies to all
sequences of length $p↑k-1$ defined by 3.2.2--8. Additional numerical examples
have been given by A. Grube, {\sl Zeitschrift f\"ur angewandte Math.\ und
Mechanik \bf53} (1973), T223--T225.]

\ansno 25. The given sum is at most two times the quantity $\sum↓{0≤k≤m/2d}r(dk)
=1+{1\over d}f(m/d)$, where 
$$\eqalign{\noalign{\vskip-5pt}
f(m)⊗={1\over m}\sum↓{1≤k≤m/2}\csc(πk/m)\cr
⊗={1\over m}\int↓1↑{m/2}\csc(πx/m)\,dx+O\left(1\over m\right)={1\over π}\left.\ln
\tan\left({π\over2m}x\right)\lower9pt\null\right|↓1↑{m/2}
+O\left(1\over m\right).\cr}$$ [When
$d=1$, we have $\sum↓{0≤k<m}r(k)=(2/π)\ln m+1+(2/π)\ln(2e/π)+O(1/m)$.]

\ansno 26. When $m=1$, we cannot use (52) since $k$ will be zero. If
$\gcd(q,m)=d$, the same derivation goes through with $m$ replaced by $m/d$.
Suppose we have $m=p↓1↑{e↓1}\ldotss p↓r↑{e↓r}$ and $\gcd(a-1,m)=p↓1↑{f↓1}\ldotss
p↓r↑{f↓r}$ and $d=p↓1↑{d↓1}\ldotss p↓r↑{d↓r}$. If $m$ is replaced by $m/d$
then $s$ is replaced by
$p↓1↑{\max(0,e↓1-f↓1-d↓1)}\ldotss p↓r↑{\max(0,e↓r-f↓r-d↓r)}$.

\ansno 27. The result is clear if $\nu↓t$ is small, say $\nu↓t<2(t-1)$, because
$\sum r(u↓1,\ldotss,u↓t)$ summed over {\sl all} $u$ is $O\left(\biglp(2/π)\ln
m\bigrp↑t\right)$. 

If $\nu↓t$ is large, it is convenient to use the following functions:
$\rho(x)=1$ if $x=0$, $\rho(x)=x$ if $0<x≤m/2$, $\rho(x)=m-x$ if $m/2<x<m$;
$\hbox{trunc}(x)=\lfloor x/2\rfloor$ if $0≤x≤m/2$, $\hbox{trunc}(x)=m-\lfloor
(m-x)/2\rfloor$ if $m/2<x<m$;
$L(x)=1$ if $x=0$, $L(x)=\lfloor\lg x\rfloor+1$ if $0<x≤m/2$, $L(x)=-\biglp\lfloor
\lg(m-x)\rfloor+1\bigrp$ if $m/2<x<m$; and $l(x)=\max\biglp 1,2↑{|x|-1}\bigrp$.
Note that $l\biglp L(x)\bigrp≤\rho(x)≤2l\biglp L(x)\bigrp$.

Say that a vector $(u↓1,\ldotss,u↓t)$ is {\sl bad} if it is nonzero and satisfies
(15). By hypothesis, all bad vectors satisfy $u↓1↑2+\cdots+u↓t↑2≥\nu↓t↑2≥2(t-1)$,
and it follows that $\rho(u↓1)\ldotsm\rho(u↓t)≥\sqrt{\nu↓t↑2-(t-1)}≥\nu↓t/\sqrt2$.
The vector
$(u↓1,\ldotss,u↓t)$ is said to be in class $\biglp L(u↓1),\ldotss,L(u↓t)\bigrp$;
thus there are at most $(2\lg m+1)↑t$ classes, and class $(L↓1,\ldotss,L↓t)$
contains at most $l(L↓1)\ldotsm l(L↓t)$ vectors. Our proof is based on showing that
the bad vectors in each fixed class contribute only $O(1/\nu↓t)$
to $\sum r(u↓1,\ldotss,u↓t)$.

Let $\mu=\lceil\,\lg\nu↓t/\sqrt2-1\rceil$. The {\sl $\mu$-fold truncation operator}
on a vector is defined to be the following operation repeated $\mu$ times: ``Let
$j$ be minimal such that $\rho(u↓j)>1$, and replace $u↓j$ by $\hbox{trunc}(u↓j)$;
but do nothing if $\rho(u↓j)=1$ for all $j$.'' $\biglp$This
operation essentially throws away one bit of
information about $(u↓1,\ldotss,u↓t)$.$\bigrp$
If $(u↓1↑\prime,\ldotss,u↓t↑\prime)$ and $(u↓1↑{\prime\prime},\ldotss,
u↓t↑{\prime\prime})$ are two vectors of the same class having the same $\mu$-fold
truncation, we say they are {\sl similar}\/; in this case it follows that
$\rho(u↓1↑\prime-u↓1↑{\prime\prime})\ldotsm\rho(u↓t↑\prime-u↓t↑{\prime\prime})≤
2↑\mu<\nu↓t/\sqrt2$. For example, any two vectors of the form $\biglp
(1x↓2x↓1)↓2,0,m-(1x↓3)↓2,(101x↓5x↓4)↓2,(1101)↓2\bigrp$ are similar when $m$
is large and $\mu=5$; the $\mu$-fold truncation operator successively removes
$x↓1$, $x↓2$, $x↓3$, $x↓4$, $x↓5$. Since the difference of two bad vectors
satisfies (15), it is impossible for two unequal bad vectors to be similar.
Therefore class $(L↓1,\ldotss,L↓t)$ can contain at most $\max\biglp 1,
l(L↓1)\ldotsm l(L↓t)/2↑\mu\bigrp$ bad vectors.
If class $(L↓1,\ldotss,L↓t)$ contains exactly one bad vector $(u↓1,\ldotss,u↓t)$,
we have $r(u↓1,\ldotss,u↓t)≤1/\rho(u↓1)\ldotsm\rho(u↓t)≤\sqrt2/\nu↓t$; if it
contains $≤l(L↓1)\ldotsm l(L↓t)/2↑\mu$ bad vectors, each of them has
$r(u↓1,\ldotss,u↓t)≤1/l(L↓1)\ldotsm l(L↓t)$.

\def\\{{\:a[}}\def\¬{{\:a]}}
\\Korobov's test, mentioned at the close of the section, was to find the minimum
of $\rho(u↓1)\ldotsm\rho(u↓t)$ over all bad $(u↓1,\ldotss,u↓t)$.\¬

\ansno 28. Let $\zeta=e↑{2πi/(m-1)}$ and let $S↓{kl}=\sum↓{0≤j<m-1}\omega
↑{x↓{j+l}}\zeta↑{jk}$. The analog of (51) is $|S↓{k0}|=\sqrt m$, hence the
analog of (53) is $\hbox{\:u\char'152}N↑{-1}\sum↓{0≤n<N}\omega↑{x↓n}
\hbox{\:u\char'152}=O\biglp(\sqrt m\log m)/N\bigrp$. The analogous theorem
now states that
$$D↓N↑{(t)}=O\left(\sqrt m(\log m)↑{t+1}\over N\right)+O\left((\log m)↑t/\nu↓t
\right),\qquad D↓{m-1}↑{(t)}=O\left((\log m)↑t\over\nu↓t\right).$$
In fact, $D↓{m-1}↑{(t)}≤{m-2\over m-1}\sum r(u↓1,\ldotss,u↓t)\;$\\summed
over nonzero solutions of (15)\¬$\null+{1\over m-1}\sum r(u↓1,\ldotss,u↓t)\;
$\\summed over all nonzero $(u↓1,\ldotss,u↓t)$\¬.
Let $R(a)=\sum r(u↓1,\ldotss,u↓t)$ summed over nonzero solutions of (15). Since
$m$ is prime, each $(u↓1,\ldotss,u↓t)$ can be a solution to (15) for at most
$t-1$ values of $a$, hence $\sum↓{0<a<m}R(a)≤(t-1)\sum r(u↓1,\ldotss,u↓t)=
O\biglp t(\log m)↑t\bigrp$. It follows that the average value of $R(a)$ taken
over all $\varphi(m-1)$ primitive roots is $O\biglp t(\log m)↑t/\varphi(m-1)\bigrp$.

$\biglp$In general $1/\varphi(n)=O(\log\log n/n)$; hence
we have proved that {\sl for all
prime $m$ and for all $T$ there exists a primitive root $a$ modulo $m$ such
that the linear congruential sequence $(1,a,0,m)$ has discrepancy $D↓{m-1}
↑{(t)}=O\biglp m↑{-1}T(\log m)↑T\log\log m\bigrp$ for $1≤t≤T$.} This method
of proof does {\sl not} extend to a similar result for linear congruential
generators of period $2↑e$ modulo $2↑e$, since for example the vector
$(1,-3,3,-1)$ solves (15) for about $2↑{2e/3}$ values of $a$.$\bigrp$
%folio 701 galley 4 (C) Addison-Wesley 1978	*
\ansbegin{3.4.1}

\ansno 1. $α + (β - α)U$.

\ansno 2. Let $U = X/m$; $\lfloor kU\rfloor = r$
iff $r ≤ kX/m < r + 1$ iff $mr/k ≤ X < m(r +
1)/k$ iff $\lceil mr/k\rceil ≤ X < \lceil m(r + 1)/k\rceil$.
The exact probability is $(1/m)\biglp\lceil m(r + 1)/k\rceil -
\lceil mr/k\rceil \bigrp = 1/k + \epsilon$, $|\epsilon | < 1/m$.
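The count of integers $X$ with $\lfloor kX/m\rfloor = r$ can be confirmed by brute force; a short Python sketch:

```python
import math

m, k = 100, 7
for r in range(k):
    count = sum(1 for X in range(m) if k * X // m == r)
    # exactly ceil(m(r+1)/k) - ceil(mr/k) values of X give floor(kX/m) = r
    assert count == math.ceil(m * (r + 1) / k) - math.ceil(m * r / k)
    # hence the probability is 1/k + eps with |eps| < 1/m
    assert abs(count / m - 1 / k) < 1 / m
```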

\ansno 3. If full-word random numbers are given,
the result will be sufficiently random as in exercise 2. But
if a linear congruential sequence is used, $k$ must be relatively
prime to the modulus $m$, lest the numbers have a very short
period, by the results of Section 3.2.1.1. For example,
if $k = 2$ and $m$ is even, the numbers will at
best be alternately 0 and 1.
The method is slower than (1) in nearly every case, so it is
not recommended.

\ansno 4. $\max(X↓1, X↓2) ≤ x$ if and only if $X↓1
≤ x$ and $X↓2 ≤ x$; $\min(X↓1, X↓2) ≥ x$ if and only if $X↓1
≥ x$ and $X↓2 ≥ x$. The probability that two independent events
both happen is the product of the individual probabilities.

\ansno 5. Obtain independent uniform deviates $U↓1$, $U↓2$.
Set $X ← U↓2$. If $U↓1 ≥ p$, set $X ← \max(X, U↓3)$, where
$U↓3$ is a third uniform deviate. If $U↓1 ≥ p + q$, also set
$X ← \max(X, U↓4)$, where $U↓4$ is a fourth uniform deviate.
This method can obviously be generalized to any polynomial,
and indeed even to infinite power series (as shown for example
in Algorithm S\null, which uses minimization instead of maximization).

Alternatively, we could proceed as follows (suggested
by M. D. MacLaren): If $U↓1 < p$, set $X ← U↓1/p$; otherwise
if $U↓1 < p + q$, set $X ← \max\biglp (U↓1 - p)/q, U↓2\bigrp
$; otherwise set $X ← \max\biglp (U↓1 - p - q)/r, U↓2, U↓3\bigrp
$. This method requires less time than the other to obtain the
uniform deviates, although it involves further arithmetical
operations and it is slightly less stable numerically.
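MacLaren's variant can be rendered directly; here is a minimal Python sketch for the distribution function $px + qx↑2 + rx↑3$ with $p + q + r = 1$ (the function name is illustrative):

```python
import random

def mixture_deviate(rng, p, q, r):
    # X <- U1/p with probability p, the max of two uniforms with
    # probability q, the max of three uniforms with probability r
    U1 = rng.random()
    if U1 < p:
        return U1 / p
    if U1 < p + q:
        return max((U1 - p) / q, rng.random())
    return max((U1 - p - q) / r, rng.random(), rng.random())
```

The empirical distribution of the output approaches $px + qx↑2 + rx↑3$.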

\ansno 6. $F(x) = A↓1/(A↓1 + A↓2)$, where $A↓1$
and $A↓2$ are the areas in Fig$.$ A--4; so
$$F(x) = {\int ↑{x}↓0 \sqrt{1 -
y↑2}\,dy\over\int ↑{1}↓{0}\sqrt{1 - y↑2}\, dy}
= {2\over π} \mathop{\hbox{arcsin}} x + {2\over π} x \sqrt{1
- x↑2}.$$
The probability of termination at step 2 is $p
= π/4$, each time step 2 is encountered, so the number of executions
of step 2 has the geometric distribution. The characteristics
of this number are $\biglp$min 1, ave $4/π$, max $∞$, dev $(4/π)
\sqrt{1 - π/4}\bigrp$, by exercise 17.

\topinsert{\vbox to 42mm{\vfill\baselineskip11pt
\hbox to 50mm{\caption Fig.\ A--4. Region of ``acceptance'' for the
algorithm of exercise 6.}}}

\ansno 7. If $k=1$ then $n↓1=n$ and the problem is trivial. Otherwise
it is always possible to find $i≠j$ such that $n↓i≤n≤n↓j$. Fill $B↓i$
with $n↓i$ cubes of color $C↓i$ and $n-n↓i$ of color $C↓j$,
then decrease $n↓j$ by $n-n↓i$ and eliminate color $C↓i$. We are
left with the same sort of problem but with $k$ reduced by 1; by
induction, it's possible.

The following algorithm can be used to compute the $P$ and $Y$ tables: Form
a list of pairs $(p↓1,1)\ldotsm(p↓k,k)$ and sort it by first components,
obtaining a list $(q↓1,a↓1)\ldotsm(q↓k,a↓k)$ where $q↓1≤\cdots≤q↓k$; set $n←k$.
Repeat the following operation until $n=0$: Set $P[a↓1-1]←kq↓1$ and
$Y[a↓1-1]←x↓{a↓n}$. Delete $(q↓1,a↓1)$ and $(q↓n,a↓n)$, then insert the
new entry $\biglp q↓n-(1/k-q↓1),a↓n\bigrp$ into its proper place in the list
and decrease $n$ by 1.

(If $p↓j≤1/k$ the algorithm will never put $x↓j$ in
the $Y$ table; this fact is used implicitly in Algorithm M\null.
The algorithm attempts to maximize the probability that $V<P↓K$ in (3),
by always robbing from the richest remaining element and giving it to the
poorest. However, it is very difficult to determine the absolute minimum of
this probability, since such a task is at least as difficult as the ``bin-packing
problem''; cf.\ Chapter 7.)
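The table-building procedure above can be sketched in Python with exact rational arithmetic; `build_tables` and `distribution` are illustrative names, and the sampling model assumed is that of Algorithm M\null: pick cell $K$ uniformly, output $x↓{K+1}$ if $V < P[K]$, otherwise $Y[K]$.

```python
import bisect
from fractions import Fraction

def build_tables(probs, values):
    # repeatedly rob the richest remaining element, give to the poorest
    k = len(probs)
    lst = sorted((Fraction(p), a) for a, p in enumerate(probs, 1))
    P, Y = [None] * k, [None] * k
    for n in range(k, 0, -1):
        q1, a1 = lst[0]
        qn, an = lst[-1]
        P[a1 - 1] = k * q1
        Y[a1 - 1] = values[an - 1]
        del lst[-1]
        if lst:                      # when n = 1 the two entries coincide
            del lst[0]
        bisect.insort(lst, (qn - (Fraction(1, k) - q1), an))
    return P, Y

def distribution(P, Y, values):
    # exact output distribution of the alias-style sampling scheme
    k = len(P)
    d = dict.fromkeys(values, Fraction(0))
    for i in range(k):
        d[values[i]] += Fraction(1, k) * P[i]
        d[Y[i]] += Fraction(1, k) * (1 - P[i])
    return d
```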

\ansno 8. Replace $P↓j$ by $(j+P↓j)/k$ for $0≤j<k$.

\ansno 9. Consider the sign of $f↑{\prime\prime}(x) = 
\sqrt{2/π}\,(x↑2 - 1)e↑{-x↑2/2}$.

\ansno 10. Let $S↓j=(j-1)/5$ for $1≤j≤16$ and $p↓{j+15}=F(S↓{j+1})-
F(S↓j)-p↓j$ for $1≤j≤15$; also let $p↓{31}=1-F(3)$ and $p↓{32}=0$.
$\biglp$Eq.\ (15) defines $p↓1$, $\ldotss$, $p↓{15}$.$\bigrp$
The algorithm of exercise 7 can now be used with $k=32$ to compute
$P↓j$ and $Y↓j$, after which we will have $1≤Y↓j≤15$ for $1≤j≤32$. Set
$P↓0←P↓{32}$ (which is 0) and $Y↓0←Y↓{32}$. Then set $Z↓j←1/(5-5P↓j)$ and
$Y↓j←{1\over5}Y↓j-Z↓j$ for $0≤j<32$; $Q↓j←1/(5P↓j)$ for $1≤j≤15$.

Let $h={1\over5}$ and $f↓{j+15}(x)=\sqrt{2/π}\,\vcenter{\hbox{\:@\char0}}
e↑{-x↑2/2}-e↑{-j↑2/50}\vcenter{\hbox{\:@\char1}}
/p↓{j+15}$
for $S↓j≤x≤S↓j+h$. Then let $a↓j=f↓{j+15}(S↓j)$ for $1≤j≤5$, $b↓j=f↓{j+15}(S↓j)$
for $6≤j≤15$; also $b↓j=-hf↑\prime↓{j+15}(S↓j+h)$ for $1≤j≤5$,
and $a↓j=f↓{j+15}(x↓j)+(x↓j-S↓j)b↓j/h$ for $6≤j≤15$, where $x↓j$ is the root
of the equation $f↓{j+15}↑\prime=-b↓j/h$. Finally set $D↓{j+15}←a↓j/b↓j$ for
$1≤j≤15$ and $E↓{j+15}←25/j$ for $1≤j≤5$, $E↓{j+15}←1/(e↑{(2j-1)/50}-1)$
for $6≤j≤15$.

Table 1 was computed while making use of the following intermediate values:
$(p↓1,\ldotss,p↓{31})=(.156$, .147, .133, .116, .097, .078, .060, .044, .032,
.022, .014, .009, .005, .003, .002, .002, .005, .007, .009, .010, .009, .009,
.008, .006, .005, .004, .002, .002, .001, .001, .003); $(x↓6,\ldotss,x↓{15})
=(1.115$, 1.304, 1.502, 1.700, 1.899, 2.099, 2.298, 2.497, 2.697, 2.896);
$(a↓1,\ldotss,a↓{15})=(7.5$, 9.1, 9.5, 9.8, 9.9, 10.0, 10.0, 10.1, 10.1, 10.1,
10.1, 10.2, 10.2, 10.2, 10.2); $(b↓1,\ldotss,b↓{15})=(14.9$, 11.7, 10.9, 10.4,
10.1, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.7, 10.8, 10.9).

\ansno 11. Let $g(t) = e↑{9/2}te↑{-t↑2/2}$ for $t
≥ 3$. Since $G(x) = \int ↑{x}↓{0} g(t)\,dt = 1 - e↑{-(x↑2-9)/2}
$, a random variable $X$ with density $g$ can be computed
by setting $X ← G↑{-1}(1 - V) = \sqrt{9 -
2 \ln V}$. Now $e↑{-t↑2/2} ≤ (t/3)e↑{-t↑2/2}$
for $t ≥ 3$, so we obtain a valid rejection method if we accept
$X$ with probability $f(X)/cg(X) = 3/X$.

\ansno 12. We have $f↑\prime (x) = xf(x) - 1 < 0$ for $x ≥ 0$,
since $f(x) = x↑{-1} - e↑{x↑2/2} \int ↑{∞}↓{x} e↑{-t↑2/2}\,dt/t↑2$
for $x > 0$. Let $x = a↓{j-1}$ and $y↑2 = x↑2
+ 2 \ln 2$; then $$\textstyle\sqrt{2/π} \int ↑{∞}↓{y}
e↑{-t↑2/2}\,dt = {1\over 2}\sqrt{2/π}
\,e↑{-x↑2/2}f(y) < {1\over 2}\sqrt{2/π}
\,e↑{-x↑2/2}f(x) =\dispstyle 2↑{-j},$$ hence $y > a↓j$.

\ansno 13. Take $b↓j = \mu ↓j$; consider now the problem with
$\mu ↓j = 0$ for each $j$. In matrix notation, if $Y = AX$,
where $A = (a↓{ij})$, we need $AA↑T = C = (c↓{ij})$.\xskip$\biglp$In other
notation, if $Y↓j = \sum a↓{jk}X↓k$, then the average value
of $Y↓iY↓j$ is $\sum a↓{ik}a↓{jk}$.$\bigrp$\xskip If this matrix equation
can be solved for $A$, it can be solved when $A$ is triangular,
since $A = BU$ for some orthogonal matrix $U$ and some triangular
$B$, and $BB↑T = C$. The desired triangular solution can be
obtained by solving the equations $a↓{11}↑2 = c↓{11}$,
$a↓{11}a↓{21} = c↓{12}$, $a↓{21}↑2+a↓{22}↑2= c↓{22}$,
$a↓{11}a↓{31} = c↓{13}$, $a↓{21}a↓{31} + a↓{22}a↓{32} = c↓{23}$,
$\ldotss $, successively for $a↓{11}$, $a↓{21}$, $a↓{22}$, $a↓{31}$,
$a↓{32}$, etc.\xskip [{\sl Note: } The covariance matrix must
be positive semidefinite, since the average value of $\biglp\sum y↓jY↓j\bigrp↑2$
is $\sum c↓{ij}y↓iy↓j$, which must be nonnegative. And there
is always a solution when $C$ is positive semidefinite, since
$C = U↑{-1}\hbox{diag}(λ↓1, \ldotss , λ↓n)U$, where the eigenvalues
$λ↓j$ are nonnegative, and $U↑{-1}\hbox{diag}(\sqrt{λ↓1},
\ldotss ,\sqrt{λ↓n})U$ is a solution.]
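The triangular (Cholesky-style) solution described here is easy to sketch in Python; the function name is illustrative, and $C$ is assumed positive definite so that the square roots exist:

```python
import math

def triangular_factor(C):
    # solve a11^2 = c11, a11 a21 = c12, a21^2 + a22^2 = c22, ...
    # successively for the rows of the lower-triangular matrix A
    n = len(C)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = C[i][j] - sum(A[i][k] * A[j][k] for k in range(j))
            A[i][j] = math.sqrt(s) if i == j else s / A[j][j]
    return A
```

Then $Y↓j = \mu↓j + \sum↓k a↓{jk}X↓k$ has the desired covariances when the $X↓k$ are independent normal deviates.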

\ansno 14. $F(x/c)$ if $c > 0$, a step function if $c = 0$,
and $1 - F(x/c)$ if $c < 0$.

\ansno 15. Distribution $\int ↑{∞}↓{-∞} F↓1(x - t)\,dF↓2(t)$.
Density $\int ↑{∞}↓{-∞} f↓1(x - t)f↓2(t)\,dt$. This is called
the {\sl convolution} of the given distributions.
%folio 707 galley 5 (C) Addison-Wesley 1978	*
\ansno 16. It is clear that $f(t) ≤ cg(t)$ for all $t$ as required.
Since $\int ↑{∞}↓{0} g(t)\,dt = 1$ we have $g(t) = Ct↑{a-1}$ for
$0 ≤ t < 1$, $Ce↑{-t}$ for $t ≥ 1$, where $C = ae/(a + e)$. A
random variable with density $g$ is easy to obtain as a mixture
of two distributions, $G↓1(x) = x↑{1/a}$ for $0 ≤ x < 1$, and
$G↓2(x) = 1 - e↑{1-x}$ for $x ≥ 1$:

\yyskip\hangindent 38pt\noindent\hbox to 38pt{\hfill\bf G1. }\!
[Initialize.]\xskip Set $p ← e/(a + e)$.\xskip
(This is the probability that $G↓1$ should be used.)

\yskip\hangindent 38pt\noindent\hbox to 38pt{\hfill\bf G2. }\!
[Generate $G$ deviate.]\xskip Generate
independent uniform deviates $U$, $V$, where $V ≠ 0$. If $U <
p$, set $X ← V↑{1/a}$ and $q ← e↑{-X}$; otherwise set $X ← 1
- \ln V$ and $q ← X↑{a-1}$.\xskip (Now $X$ has density $g$, and $q
= f(X)/cg(X)$.)

\yskip\hangindent 38pt\noindent\hbox to 38pt{\hfill\bf G3. }\!
[Reject?]\xskip Generate a
new uniform deviate $U$. If $U ≥ q$, return to G2.\quad\blackslug

\yyskip\noindent The average number of iterations
is $c = (a + e)/e\Gamma (a + 1) < 1.4$.

It is possible to streamline this procedure in
several ways. First, we can replace $V$ by an exponential deviate
$Y$ of mean 1, generated by Algorithm S\null, say, and then we set
$X ← e↑{-Y/a}$ or $X ← 1 + Y$ in the two cases. Moreover, if
we set $q ← pe↑{-X}$ in the first case and $q ← p + (1 - p)X↑{a-1}$
in the second, we can use the original $U$ instead of a newly
generated one in step G3. Finally if $U < p/e$ we can accept
$V↑{1/a}$ immediately, avoiding the calculation of $q$ about
30 percent of the time.
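Steps G1 through G3, without the streamlinings, can be sketched in Python for $0 < a < 1$ (the function name is illustrative):

```python
import math
import random

def gamma_deviate(rng, a):
    # rejection from g(t) = C t^(a-1) on [0,1), C e^(-t) on [1,oo),
    # valid for 0 < a < 1; average number of iterations is
    # (a + e)/(e Gamma(a+1)) < 1.4
    p = math.e / (a + math.e)
    while True:
        U, V = rng.random(), rng.random()
        if V == 0.0:
            continue                 # step G2 requires V != 0
        if U < p:
            X = V ** (1.0 / a)       # G1 component, density a x^(a-1)
            q = math.exp(-X)
        else:
            X = 1.0 - math.log(V)    # G2 component, density e^(1-x)
            q = X ** (a - 1.0)
        if rng.random() < q:         # accept with probability f(X)/cg(X)
            return X
```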

\ansno 17. (a)\9 $F(x) = 1 - (1 - p)↑{\lfloor x \rfloor}$, for $x ≥ 0$.\xskip
(b) $G(z) = pz/\biglp 1 - (1 - p)z\bigrp$.\xskip (c) Mean $1/p$,
standard deviation $\sqrt{1-p}/p$. To do the
latter calculation, observe that if $H(z) = q + (1 - q)z$, then
$H↑\prime (1) = 1 - q$ and $H↑{\prime\prime} (1) + H↑\prime (1) - \biglp
H↑\prime (1)\bigrp ↑2 = q(1 - q)$, so the mean and variance
of $1/H(z)$ are $q - 1$ and $q(q - 1)$, respectively.\xskip (See Section
1.2.10.)\xskip In this case, $q = 1/p$; the extra factor $z$ in the
numerator of $G(z)$ increases the mean by one.

\ansno 18. Set $N ← N↓1 + N↓2 - 1$, where $N↓1$ and $N↓2$
independently have the geometric distribution for probability
$p$.\xskip (Consider the generating function.)

\ansno 19. Set $N ← N↓1 + \cdots + N↓t - t$, where the $N↓j$
have the geometric distribution for $p$.\xskip (This is the number
of failures before the $t$th success, when a sequence of independent
trials are made each of which succeeds with probability $p$.)

For $t = p = {1\over 2}$, and in general when
the mean value $\biglp$namely $t(1 - p)/p\bigrp$
of the distribution is small, we can simply evaluate the probabilities
$p↓n = {t - 1 + n\choose n}p↑t(1 - p)↑n$ consecutively for $n
= 0$, 1, 2, $\ldots$ as in the following algorithm:

\yyskip\hangindent 38pt\noindent\hbox to 38pt{\hfill\bf B1. }\!
[Initialize.]\xskip Set $N ← 0$, $q ← p↑t$,
$r ← q$, and generate a random uniform deviate $U$.\xskip (We will
have $q = p↓N$ and $r = p↓0 + \cdots + p↓N$ during this algorithm,
which stops as soon as $U < r$.)

\yskip\hangindent 38pt\noindent\hbox to 38pt{\hfill\bf B2. }\!
[Search.] If $U ≥ r$, set $N
← N + 1$, $q ← q(1 - p)(t - 1 + N)/N$, $r ← r + q$, and repeat this
step.\quad\blackslug

\yyskip [An interesting technique
for the negative binomial distribution, for arbitrarily large
real values of $t$, has been suggested by R. L\'eger:
First generate a random gamma deviate $X$ of order $t$, then let
$N$ be a random Poisson deviate of mean $X(1 - p)/p$.]
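The consecutive-probabilities method of steps B1 and B2 is, in effect, inversion of the distribution function; a Python sketch (illustrative name):

```python
def negative_binomial(U, t, p):
    # smallest N with U < p_0 + ... + p_N, where
    # p_n = binomial(t-1+n, n) p^t (1-p)^n
    N = 0
    q = p ** t            # q = p_N
    r = q                 # r = p_0 + ... + p_N
    while U >= r:
        N += 1
        q = q * (1 - p) * (t - 1 + N) / N
        r += q
    return N
```

For $t = 2$, $p = {1\over2}$ the cumulative probabilities are $.25$, $.5$, $.6875$, $\ldotss$, so a uniform deviate $U = 0.6$ yields $N = 2$.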

\ansno 20. $\hbox{R1}=1+(1-A/R)\cdot\hbox{R1}$. When R2 is performed,
the algorithm terminates with probability $I/R$; when R3 is performed,
it goes to R1 with probability $E/R$. We have
$$\vbox{\halign{#\qquad⊗\ctr{$#$}\qquad⊗\ctr{$#$}\qquad⊗\ctr{$#$}\qquad
⊗\ctr{$#$}\cr
R1⊗R/A⊗R/A⊗R/A⊗R/A\cr
R2⊗0⊗R/A⊗0⊗R/A\cr
R3⊗0⊗0⊗R/A⊗R/A-I/A\cr
R4⊗R/A⊗R/A-I/A⊗R/A-E/A⊗R/A-I/A-E/A\cr}}$$

\ansno 21. $R=\sqrt{8/e}\approx 1.71153$; $A=\sqrt{π/2}\approx 1.25331$.
Since $$\int u\sqrt{a-bu}\,du=\textstyle
(a-bu)↑{3/2}\left({2\over5}(a-bu)-{2\over3}a\right)
/b↑2,$$ we have $I=\int↓0↑{a/b}u\sqrt{a-bu}\,du={4\over15}a↑{5/2}/b↑2$ where $a=4(1+
\ln c)$ and $b=4c$; when $c=e↑{1/4}$, $I$ has its maximum value ${5\over6}\sqrt
{5/e}\approx1.13020$. Finally the following integration formulas are needed for
$E$:$$\eqalign{\int\sqrt{bu-au↑2}\,du⊗
\textstyle={1\over8}b↑2a↑{-3/2}\hbox{arcsin}(2ua/b-1)+
{1\over4}ba↑{-1}\sqrt{bu-au↑2}\,(2ua/b-1),\cr\noalign{\vskip 2pt}
\int\sqrt{bu+au↑2}\,du⊗\textstyle=-{1\over 8}
b↑2a↑{-3/2}\ln\vcenter{\hbox{\:@\char0}}
\sqrt{bu+au↑2}+u\sqrt a+b/2\sqrt a\vcenter{\hbox{\:@\char1}}\cr
⊗\hskip100pt \textstyle\null+{1\over4}ba↑{-1}\sqrt{bu+au↑2}\,
(2ua/b+1),\cr}$$ where $a,b>0$. Let the test in step R3 be ``$X↑2≥4e↑{x-1}/U-4x$'';
then the exterior region hits the top of the rectangle when $u=r(x)=\biglp e↑x
-\sqrt{e↑{2x}-2ex}\bigrp/2ex$.\xskip (Incidentally, $r(x)$ reaches its maximum value
at $x=1/2$, a point where it is {\sl not} differentiable!)\xskip We have $E=\int↓0
↑{r(x)}\biglp\sqrt{2/e}-\sqrt{bu-au↑2}\bigrp\,du$ where $b=4e↑{x-1}$ and
$a=4x$. The maximum value of $E$ occurs near $x=-.35$, where we
have $E\approx.29410$.

\ansno 22. (Solution by G. Marsaglia.)\xskip Consider the ``continuous
Poisson distribution'' $G(x) = \int ↑{∞}↓{\mu } e↑{-t}t↑{x-1}
\,dt/\Gamma (x)$, for $x > 0$; if $X$ has this distribution then
$\lfloor X\rfloor$ is Poisson distributed, since $G(x + 1) -
G(x) = e↑{-\mu}\mu↑x/x!$. If $\mu$ is large, $G$ is approximately
normal, hence $G↑{-1}\biglp F↓\mu(x)\bigrp$ is approximately linear,
where $F↓\mu(x)$ is the distribution function for a normal deviate
with mean and variance $\mu$; i.e., $F↓\mu(x)=F\biglp (x-\mu)/\sqrt\mu\bigrp$,
where $F(x)$ is the normal distribution function (10). Let $g(x)$
be an efficiently computable function such that $|G↑{-1}\biglp
F↓\mu(x)\bigrp - g(x)| < \epsilon$ for $-∞ < x < ∞$; we can now
generate Poisson deviates efficiently as follows: Generate a
normal deviate $X$, and set $Y ← g(\mu+\sqrt\mu\,X)$, $N ← \lfloor Y\rfloor
$, $M ← \lfloor Y + {1\over 2}\rfloor $. If $|Y - M| > \epsilon
$, output $N$; otherwise output $M - 1$ or $M$, according as
$G↑{-1}\biglp F(X)\bigrp < M$ or not.

This approach applies also to the binomial distribution,
with $$G(x) = \int ↑{1}↓{p} u↑{x-1}(1 - u)↑{t-x}\, du \,\Gamma (t
+ 1)/\Gamma (x)\Gamma (t + 1 - x),$$ since $\lfloor G↑{-1}(U)\rfloor$
is binomial with parameters $(t, p)$ and $G$ is approximately
normal.
%folio 710 galley 6 (C) Addison-Wesley 1978	*
\ansno 23. Yes. The second method calculates $|\cos 2\theta
|$, where $\theta$ is uniformly distributed between 0 and $π/2$.\xskip
(Let $U = r \cos \theta$, $V = r \sin \theta$.)

\ansno 25. ${21\over 32} = (.10101)↓2$. In general,
the binary representation is formed by using 1 for $∨$, 0 for
$∧$, from left to right, then suffixing 1. This technique [cf.\
K. D. Tocher, {\sl J. Roy.\ Stat.\ Soc.\ \bf B-16} (1954), 49]
can lead to efficient generation of independent bits having
a given probability $p$, and it can also be applied to the geometric and binomial
distributions.
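The correspondence can be checked by enumerating all bit vectors; a Python sketch, writing $∨$, $∧$ as `'|'`, `'&'`:

```python
from itertools import product

def formula_prob(ops):
    # probability that b1 op1 (b2 op2 (... b_{k+1})) equals 1
    # when b1, ..., b_{k+1} are independent fair random bits
    k = len(ops)
    count = 0
    for bits in product((0, 1), repeat=k + 1):
        v = bits[-1]
        for i in range(k - 1, -1, -1):
            v = (bits[i] | v) if ops[i] == '|' else (bits[i] & v)
        count += v
    return count, 2 ** (k + 1)

# ops '|','&','|','&' give bits 1,0,1,0; suffixing 1 yields
# (.10101)_2 = 21/32, matching the enumeration
assert formula_prob('|&|&') == (21, 32)
assert int('1010' + '1', 2) == 21
```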

\ansno 26. (a)\9 True, $\sum ↓k\Pr(N↓1 = k)\Pr(N↓2 = n - k)
= e↑{-\mu↓1-\mu↓2}(\mu↓1+\mu↓2)↑n/n!$.\xskip (b)
False, unless $\mu ↓2 = 0$; otherwise $N↓1 - N↓2$ might be
negative.

\ansno 27. Let the binary representation of $p$ be $(.b↓1b↓2b↓3
\ldotsm)↓2$, and proceed as follows.

\yyskip\hangindent 38pt\noindent\hbox to 38pt{\hfill\bf B1. }\!
[Initialize.]\xskip Set $m
← t$, $N ← 0$, $j ← 1$.\xskip (During this algorithm, $m$ represents the
number of simulated uniform deviates whose relation to $p$ is
still unknown, since they match $p$ in their leading $j - 1$
bits; and $N$ is the number of simulated deviates known to be
less than $p$.)

\yskip\hangindent 38pt\noindent\hbox to 38pt{\hfill\bf B2. }\!
[Look at next column of bits.]\xskip
Generate a random integer $M$ with the binomial distribution
$(m, {1\over 2})$.\xskip (Now $M$ represents the number of unknown
deviates that fail to match $b↓j$.)\xskip Set $m ← m - M$, and if
$b↓j = 1$ set $N ← N + M$.

\yskip\hangindent 38pt\noindent\hbox to 38pt{\hfill\bf B3. }\!
[Done?]\xskip If $m = 0$ or if the remaining bits $(.b↓{j+1}b↓{j+2}
\ldotsm)↓2$ of $p$ are all zero, the algorithm terminates. Otherwise, set $j ← j
+ 1$ and return to step B2.\quad\blackslug


\yyskip\noindent [When $b↓j = 1$ for infinitely
many $j$, the average number of iterations $A↓t$ satisfies
$$\chop to 12pt{A↓0 = 0;\qquad A↓n = 1 + {1\over 2↑n} \sum ↓{k}{n\choose k}
A↓k,\quad\hbox{for }n ≥ 1.}$$
Letting $A(z) = \sum A↓nz↑n/n!$, we have $A(z)
= e↑z - 1 + A({1\over 2}z)e↑{z/2}$. Therefore $A(z)e↑{-z} = 1 -
e↑{-z} + A({1\over 2}z)e↑{-z/2} = \sum ↓{k≥0}(1 - e↑{-z/2↑k}
) = 1 - e↑{-z} + \sum ↓{n≥1} z↑n(-1)↑{n+1}/\biglp n! (2↑n - 1)\bigrp$,
and
$$A↓n = 1 + \sum ↓{k≥1} {n\choose k} {(-1)↑{k+1}\over
2↑k - 1} = 1 + {V↓{n+1}\over n + 1} = \lg n + {\gamma \over
\ln 2} + {1\over 2} + f↓0(n) + O(n↑{-1})$$
in the notation of exercise 5.2.2--48.]

\baselineskip 13pt %the next exercise wants loose spacing
\ansno 28. Generate a random point $(y↓1, \ldotss , y↓n)$ on
the unit sphere, and let $\rho = \sqrt{\sum
a↓ky↓k↑2}$. Generate an independent uniform deviate $U$,
and if $\rho ↑{n+1}U < K\sqrt{\sum a↓k↑2y↓k↑2}$, output the
point $(y↓1/\rho , \ldotss , y↓n/\rho )$; otherwise
start over. Here $K↑2 = \min\leftset(\sum a↓ky↓k↑2)↑{n+1}/(\sum
a↓k↑2y↓k↑2)\relv \sum y↓k↑2 = 1\rightset = a↑{n-1}↓{n}$
if $na↓n ≥ a↓1$, $\biglp(n + 1)/(a↓1 + a↓n)\bigrp↑{n+1}(a↓1a↓n/n)↑n$ otherwise.

\baselineskip 11pt % resume normal interline spacing
\ansbegin{3.4.2}

\ansno 1. There are $N - t\choose n-m$
ways to pick $n - m$ records from the last $N - t$; $N-t-1\choose n-m-1$
ways to pick $n - m - 1$ from $N -
t - 1$ after selecting the $(t + 1)$st item.

\ansno 2. Step S3 will never go to step S5 when
the number of records left to be examined is equal to $n - m$.

\ansno 3. We should not confuse ``conditional''
and ``unconditional'' probabilities. The quantity $m$ depends
randomly on the selections that took place among the first
$t$ elements; if we take the average over all possible choices
that could have occurred among these elements, we will find that
$(n - m)/(N - t)$ is exactly $n/N$ on the average. For example,
consider the second element; if the first element was selected
in the sample (this happens with probability $n/N$), the second
element is selected with probability $(n - 1)/(N - 1)$; if the
first element was not selected, the second is selected with
probability $n/(N - 1)$. The overall probability of selecting
the second element is $(n/N)\biglp(n - 1)/(N - 1)\bigrp + (1 - n/N)\biglp n/(N
- 1)\bigrp = n/N$.
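The averaging argument can be confirmed empirically; the following Python sketch (an illustrative implementation, not from the text) performs selection sampling with the $(n-m)/(N-t)$ rule, and over many runs each record turns out to be selected with overall probability $n/N$:

```python
import random

def selection_sample(n, N, rng=random):
    """Select n of the records 1..N sequentially: record t+1 is taken
    with probability (n - m)/(N - t), where m records are already chosen."""
    sample = []
    for t in range(N):
        # select record t+1 iff U < (n - m)/(N - t)
        if (N - t) * rng.random() < n - len(sample):
            sample.append(t + 1)
            if len(sample) == n:
                break
    return sample
```

Tallying `selection_sample(3, 8)` over many trials gives each of the eight records a relative frequency near $3/8$, even though the conditional probabilities vary from step to step.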

\ansno 4. From the algorithm,
$$p(m, t + 1) = \left(1 - {n - m\over N - t}\right)p(m, t) + {n -
(m - 1)\over N - t} p(m - 1, t).$$
The desired formula can be proved by induction
on $t$. In particular, $p(n, N) = 1$.

\ansno 5. In the notation of exercise 4, the probability
that $t = k$ at termination is $q↓k = p(n, k) - p(n, k - 1)
= {k-1\choose n-1}/{N\choose n}$. The average is $\sum↓{0≤k≤N}
kq↓k = (N + 1)n/(n + 1)$.
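The closed form is easy to confirm with exact rational arithmetic; this Python snippet (illustrative only) computes $\sum kq↓k$ directly from the definition of $q↓k$:

```python
from fractions import Fraction
from math import comb

def average_termination(n, N):
    """Exact average of t at termination: sum of k*q_k over n <= k <= N,
    where q_k = C(k-1, n-1)/C(N, n)."""
    return sum(Fraction(k * comb(k - 1, n - 1), comb(N, n))
               for k in range(n, N + 1))
```

For example, `average_termination(3, 8)` yields $27/4$, in agreement with $(N+1)n/(n+1)$.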

\ansno 6. Similarly, $\sum↓{0≤k≤N}k(k
+ 1)q↓k = (N + 2)(N + 1)n/(n + 2)$; the variance is therefore
$(N + 1)(N - n)n/(n + 2)(n + 1)↑2$.

\ansno 7. Suppose the choice is $1 ≤ x↓1 < x↓2
< \cdots < x↓n ≤ N$. Let $x↓0 = 0$, $x↓{n+1} = N + 1$. The choice
is obtained with probability $p = \prod ↓{1≤t≤N}p↓t$,
where
$$\vbox{\halign{\lft{$\dispstyle p↓t={#}$}\qquad⊗\hbox{for }$#$\hfill\cr
N-(t-1)-n+m\over N-(t-1)⊗x↓m<t<x↓{m+1};\cr
n-m\over N-(t-1)⊗t=x↓{m+1}.\cr}}$$
The denominator of the product $p$ is $N!$; the numerator
contains the terms $N - n$, $N - n - 1$, $\ldotss$, 1 for those
$t$'s that are not $x$'s, and the terms $n$, $n - 1$, $\ldotss$,
1 for those $t$'s that {\sl are} $x$'s. Hence $p = (N - n)!n!/N!$.\xskip
{\sl Example:} $n = 3$, $N = 8$, $(x↓1, x↓2, x↓3) = (2, 3, 7)$; $p
= {5\over8} {3\over 7} {2\over 6} {4\over 5} {3\over 4} {2\over
3} {1\over 2} {1\over 1}$.
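The product $p$ can be evaluated mechanically; this Python fragment (illustrative) walks through the records as in the answer, multiplying the appropriate factor at each $t$:

```python
from fractions import Fraction

def choice_probability(x, N):
    """Exact probability that selection sampling produces the ordered
    choice x_1 < x_2 < ... < x_n from the records 1..N."""
    n, m, p = len(x), 0, Fraction(1)
    for t in range(1, N + 1):
        if m < n and t == x[m]:
            p *= Fraction(n - m, N - t + 1)                  # record t is selected
            m += 1
        else:
            p *= Fraction(N - t + 1 - (n - m), N - t + 1)    # record t is skipped
    return p
```

Running it on the worked example, `choice_probability([2, 3, 7], 8)` reproduces the product above, namely $1/56 = 5!\,3!/8!$.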

\ansno 9. The reservoir gets seven records: 1,
2, 3, 5, 9, 13, 16. The final sample consists of records 2, 5, 16.

\ansno 10. Delete step R6 and the variable $m$. Replace the $I$ table by
a table of records, initialized to the first $n$ records in step R1,
and with the new record replacing the $M$th table entry in step R4.
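In modern terminology this modified algorithm is reservoir sampling with the records themselves in the table; a Python sketch (illustrative, with comments keyed to the steps of Algorithm R):

```python
import random

def reservoir_sample(stream, n, rng=random):
    """Keep a table of n records; record t (for t > n) replaces a random
    table entry with probability n/t, as in the modified Algorithm R."""
    table = []
    for t, record in enumerate(stream, start=1):
        if t <= n:
            table.append(record)          # R1: the first n records fill the table
        else:
            M = rng.randrange(t)          # M uniform over 0..t-1
            if M < n:
                table[M] = record         # R4: replace the (M+1)st table entry
    return table
```

Each record of the stream ends up in the final table with probability $n/N$, and no auxiliary $I$ table or reservoir file is needed.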

\ansno 11. Arguing as in Section 1.2.10, which considers the
special case $n = 1$, we see that the generating function is
$$G(z) = z↑n\left({1\over n + 1} + {n\over n + 1} z\right)\left({2\over
n + 2} + {n\over n + 2} z\right)\ldotsm\left({N - n\over N} + {n\over
N} z\right).$$
The mean is $n + \sum ↓{n<t≤N}(n/t) = n(1
+ H↓N - H↓n)$; and the variance is $n(H↓N - H↓n) - n↑2(H↑{(2)}↓{N}
- H↑{(2)}↓{n})$.

\ansno 12. (Note that $π↑{-1} = (b↓tt) \ldotsm (b↓33)(b↓22)$,
so we seek an algorithm that goes from the representation of
$π$ to that for $π↑{-1}$.)\xskip Set $b↓j ← j$ for $1 ≤ j ≤ t$. Then
for $j = 2$, 3, $\ldotss$, $t$ (in this order) interchange $b↓j↔
b↓{a↓j}$. Finally for $j = t$, $\ldotss$, 3, 2 (in this order)
set $b↓{a↓j} ← b↓j$.\xskip $\biglp$The algorithm is based on the fact that
$(a↓tt)π↓1 = π↓1(b↓tt)$.$\bigrp$
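The two passes transcribe directly into Python (an illustrative sketch; here `a[j]` for $2 ≤ j ≤ t$ holds the given representation, with `a[0]` and `a[1]` unused):

```python
def invert_representation(a, t):
    """Given a[2..t] with 1 <= a[j] <= j representing pi = (a_t t)...(a_2 2),
    compute b[2..t] representing pi^{-1} = (b_t t)...(b_2 2)."""
    b = list(range(t + 1))                 # b_j <- j
    for j in range(2, t + 1):              # interchange b_j <-> b_{a_j}
        b[j], b[a[j]] = b[a[j]], b[j]
    for j in range(t, 1, -1):              # set b_{a_j} <- b_j
        b[a[j]] = b[j]
    return b
```

A convenient check: multiplying out the transpositions $(b↓tt)\ldotsm(b↓22)$ gives the inverse of the permutation obtained from $(a↓tt)\ldotsm(a↓22)$, and this holds under either order-of-multiplication convention.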

\ansno 13. Renumbering the deck 0, 1,
$\ldotss$, $2n - 2$, we find that $s$ takes $x$ into $(2x)\mod(2n -
1)$, $c$ takes $x$ into $(x + 1)\mod(2n - 1)$. We have ($c$ followed
by $s) = cs = sc↑2$. Therefore any product of $c$'s and $s$'s
can be transformed into the form $s↑ic↑k$. Also $2↑{\varphi(2n-1)}
≡ 1$ $\biglp$ modulo $(2n - 1)\bigrp$; since $s↑{\varphi(2n-1)}$
and $c↑{2n-1}$ are the identity permutation, at most $(2n -
1)\varphi (2n - 1)$ arrangements are possible.\xskip (The {\sl exact}
number of different arrangements is $(2n - 1)k$, where $k$ is
the order of 2 modulo $(2n - 1)$. For if $s↑k = c↑j$, then $c↑j$
fixes the card 0, so $s↑k = c↑j =\null$identity.)\xskip For further details,
see {\sl SIAM Review \bf 3} (1961), 293--297.
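The exact count $(2n-1)k$ is easy to compute; in this Python fragment (illustrative) the order $k$ of 2 modulo $2n-1$ is found by repeated doubling:

```python
def shuffle_arrangements(n):
    """Number of distinct arrangements of 2n-1 cards reachable by the
    shuffles s and c: (2n-1)*k, where k is the order of 2 mod (2n-1)."""
    m = 2 * n - 1
    assert n >= 2                   # the one-card deck is degenerate
    k, p = 1, 2 % m
    while p != 1:                   # double until 2^k = 1 (mod m)
        p = 2 * p % m
        k += 1
    return m * k
```

For instance, a deck of 5 cards ($n = 3$) admits $5 \cdot 4 = 20$ arrangements, since 2 has order 4 modulo 5.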

\ansno 14. Set $Y↓j ← j$ for $t - n < j ≤ t$. Then for $j =
t$, $t - 1$, $\ldotss$, $t - n + 1$ do the following operations: Set
$k ← \lfloor jU\rfloor + 1$. If $k > t - n$ then set $X↓j ←
Y↓k$ and $Y↓k ← Y↓j$; otherwise if $k = X↓i$ for some $i > j$
(a symbol table algorithm could be used), then set $X↓j ← Y↓i$
and $Y↓i ← Y↓j$; otherwise set $X↓j ← k$.\xskip (The idea is to let
$Y↓{t-n+1}$, $\ldotss$, $Y↓j$ represent $X↓{t-n+1}$, $\ldotss$, $X↓j$,
and if $i > j$ and $X↓i ≤ t - n$ also to let $Y↓i$ represent
$X↓{X↓i}$, in the execution of Algorithm P.)
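Here is one way to transcribe this into Python (an illustrative sketch; the dictionary `where` plays the role of the symbol table that locates the position $i > j$ with $X↓i = k ≤ t - n$):

```python
import random

def sample_tail(t, n, rng=random):
    """Simulate the last n exchange steps of shuffling 1..t (Algorithm P)
    in O(n) memory; returns (X_{t-n+1}, ..., X_t)."""
    Y = {j: j for j in range(t - n + 1, t + 1)}
    X = {}
    where = {}                      # where[v]: the position i > j with X_i = v <= t-n
    for j in range(t, t - n, -1):
        k = int(j * rng.random()) + 1        # uniform in 1..j
        if k > t - n:
            X[j], Y[k] = Y[k], Y[j]
        elif k in where:
            i = where[k]            # Y_i represents X_k
            X[j], Y[i] = Y[i], Y[j]
        else:
            X[j] = k
            where[k] = j            # from now on Y_j represents X_{X_j} = X_k
    return [X[j] for j in range(t - n + 1, t + 1)]
```

When $n = t$ the procedure degenerates into a full shuffle, so `sample_tail(t, t)` returns a random permutation of $\{1, \ldots, t\}$.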

\ansno 15. We may assume that $n≤{1\over2}N$, otherwise it suffices to find
the $N-n$ elements {\sl not\/} in the sample. Using a hash table of size $2n$, the
idea is to generate random numbers between 1 and $N$, storing them in the table
and discarding duplicates, until $n$ distinct numbers have been generated. The
average number of random numbers generated is $N/N+N/(N-1)+\cdots+N/(N-n+1)<2n$,
by exercise 3.3.2--10, and the average time to process each number is $O(1)$.
We want to output the results in increasing order, and this can be done as
follows: Using an ordered hash table (exercise 6.4--66) with linear probing,
the hash table will appear as if the values had been inserted in increasing order.
Thus if we use
a monotonic hash address such as $\lceil nk/N\rceil$ for the key $k$,
it will be a simple matter to output the keys in sorted order by making at most
two passes over the table.
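The core of the method can be sketched in a few lines of Python (illustrative only; a built-in set and a final sort stand in for the ordered hash table with monotonic hash addresses, which is what makes the real method's output pass linear-time):

```python
import random

def distinct_sorted_sample(n, N, rng=random):
    """Draw uniform integers in 1..N, discarding duplicates, until n
    distinct values appear; output them in increasing order.  Also
    returns the number of draws, which averages less than 2n."""
    assert 2 * n <= N              # otherwise sample the complement instead
    seen, draws = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(1, N + 1))
        draws += 1
    return sorted(seen), draws
```

The returned draw count makes it easy to verify the $< 2n$ bound on the expected number of random numbers generated.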
%folio 715 galley 7 (C) Addison-Wesley 1978	*
\ansbegin{3.5}

\ansno 1. A $b$-ary sequence, yes (cf.\
exercise 2); a $[\,0,1)$ sequence, no (since only finitely many
values are assumed by the elements).

\ansno 2. It is 1-distributed and 2-distributed,
but not 3-distributed (the binary number 111 never appears).

\ansno 3. Cf.\ exercise 3.2.2--17; repeat the sequence
there with a period of length 27.

\ansno 4. The sequence begins ${1\over 3}$, ${2\over
3}$, ${2\over 3}$, ${1\over 3}$, ${1\over 3}$, ${1\over 3}$,
${1\over 3}$, ${2\over 3}$, ${2\over 3}$, ${2\over 3}$, ${2\over
3}$, ${2\over 3}$, ${2\over 3}$, ${2\over 3}$, ${2\over 3}$,
etc. When $n = 1$, 3, 7, 15, $\ldots$ we have $\nu (n) = 1$, 1,
5, 5, $\ldots$ so that $\nu (2↑{2k-1} - 1) = \nu (2↑{2k} - 1)
= (2↑{2k} - 1)/3$; hence $\nu (n)/n$ oscillates between ${1\over
3}$ and approximately ${2\over 3}$, and no limit exists.
The probability is undefined.

$\biglp$The methods of Section 4.2.4 show, however, that a numerical
value {\sl can} meaningfully be assigned to $$\Pr(U↓n < {1\over
2}) = \Pr (\hbox{leading digit of the radix-4 representation of $n +
1$ is 1}),$$ namely $\log↓4 2 = {1\over 2}$.$\bigrp$

\ansno 5. If $\nu ↓1(n)$, $\nu ↓2(n)$, $\nu ↓3(n)$,
$\nu ↓4(n)$ are the counts corresponding to the four probabilities,
we have $\nu ↓1(n) + \nu ↓2(n) = \nu ↓3(n) + \nu ↓4(n)$ for
all $n$. So the desired result follows by addition of limits.

\def\Pro{\mathop{\overline{\char'120\char'162}}}
\def\Pru{\mathop{\underline{\char'120\char'162}}}
\ansno 6. By exercise 5 and induction,
$$\chop to 9pt{\Pr\biglp S↓j(n)\hbox{ for some }j,\; 1 ≤ j ≤ k\bigrp 
= \sum ↓{1≤j≤k} \Pr\biglp S↓j(n)\bigrp .}$$
As $k → ∞$, the latter is a monotone sequence bounded
by 1, so it converges; and
$$\chop to 9pt{\Pru\biglp S↓j(n)\hbox{ for some }j≥1\bigrp≥\sum↓{1≤j≤k}\Pr\biglp
S↓j(n)\bigrp}$$
for all $k$. For a counterexample to equality,
it is not hard to arrange things so that $S↓j(n)$ is always
true for {\sl some} $j$, yet $\Pr\biglp S↓j(n)\bigrp = 0$ for {\sl
all} $j$.

\ansno 7. Let $p↓i = \sum ↓{j≥1}\Pr\biglp S↓{ij}(n)\bigrp
$. The result of the preceding exercise can be generalized to
$\Pru\biglp S↓j(n)\hbox{ for some }j ≥ 1\bigrp ≥ \sum ↓{j≥1}\Pru\biglp
S↓j(n)\bigrp $, for {\sl any} disjoint statements $S↓j(n)$.
So we have $1 = \Pr\biglp S↓{ij}(n)\hbox{ for some }i, j ≥ 1\bigrp
≥ \sum ↓{i≥1}\Pru\biglp S↓{ij}(n)\hbox{ for some }j ≥ 1\bigrp
≥ \sum ↓{i≥1} p↓i = 1$, and hence $\Pru\biglp S↓{ij}(n)\hbox{ for
some }j ≥ 1\bigrp = p↓i$. Given $\epsilon > 0$, let $I$ be large
enough so that $\sum↓{1≤i≤I}p↓i ≥ 1 - \epsilon $.
Let $$\phi ↓i(N) = \biglp\hbox{number of $n<N$ with $S↓{ij}(n)$ true for
some $j ≥ 1$}\bigrp /N.$$ Clearly $\sum ↓{1≤i≤I}
\phi ↓i(N) ≤ 1$, and for all large enough $N$ we have $\sum ↓{2≤i≤I}
\phi ↓i(N) ≥ \sum ↓{2≤i≤I}p↓i - \epsilon $; hence $\phi
↓1(N) ≤ 1 - \phi ↓2(N) - \cdots - \phi ↓I(N) ≤ 1 - p↓2 - \cdots
- p↓I + \epsilon ≤ 1 - (1 - \epsilon - p↓1) + \epsilon = p↓1
+ 2\epsilon $. This proves that $\Pro\biglp S↓{1j}(n)\hbox{ for some }j ≥ 1\bigrp
≤ p↓1 + 2\epsilon $; hence $\Pr\biglp S↓{1j}(n)\hbox{ for some }j ≥ 1\bigrp
= p↓1$, and the desired result holds for $i = 1$. By symmetry
of the hypotheses, it holds for any value of $i$.

\ansno 8. Add together the probabilities
for $j$, $j + d$, $j + 2d$, $\ldots$ in Definition E.

\ansno 9. $\limsup↓{n→∞}(a↓n+b↓n)≤\limsup↓{n→∞}a↓n+\limsup↓{n→∞}b↓n$; hence we get
$$\chop to 9pt{\limsup↓{n→∞} \biglp (y↓{1n} - α)↑2 + \cdots + (y↓{mn}
- α)↑2\bigrp ≤ mα↑2 - 2mα↑2 + mα↑2 = 0,}$$
and this can happen only if each $(y↓{jn} - α)$ tends to zero.

\ansno 10. In the evaluation of the sum in Eq$.$ (22).

\ansno 11. $\langle U↓{2n}\rangle$ is $k$-distributed if $\langle
U↓n\rangle$ is $(2, 2k)$-distributed.

\ansno 12. Let $f(x↓1, \ldotss , x↓k) = 1$ if $u ≤ \max(x↓1,
\ldotss , x↓k) < v$; $f(x↓1, \ldotss , x↓k) = 0$ otherwise. Then
apply Theorem B.

\ansno 13. Let
$$\eqalign{p↓k⊗ = \Pr(U↓n \hbox{ begins a gap of length }k-1)\cr
⊗=\Pr\biglp U↓{n-1} \in [α, β),\ U↓n \notin [α, β),\ \ldotss
,\ U↓{n+k-2} \notin [α, β),\ U↓{n+k-1} \in [α,β)\bigrp\cr
⊗=p↑2(1-p)↑{k-1}.\cr}$$
%folio 717 galley 8 (C) Addison-Wesley 1978	*
It remains to translate this into the probability that $f(n)
- f(n - 1) = k$. Let $\nu ↓k(n) = \biglp \hbox{number of $j ≤ n$ with $f(j)
- f(j - 1) = k$}\bigrp $; let $\mu ↓k(n) = ($number of $j ≤ n$
with $U↓j$ the beginning of a gap of length $k - 1)$; and let
$\mu (n)$ similarly count the number of $1 ≤ j ≤ n$ with $U↓j
\in [α, β)$. We have $\mu↓k\biglp f(n)\bigrp=\nu ↓k(n)$,
$\mu\biglp f(n)\bigrp=n$. As $n → ∞$, we must have $f(n)→∞$, hence
$$\nu ↓k(n)/n = \biglp \mu ↓k(f(n))/f(n)\bigrp \cdot \biglp
f(n)/\mu (f(n))\bigrp → p↓k/p = p(1 - p)↑{k-1}.$$
[We have only made use of the fact that the sequence is
$(k + 1)$-distributed.]

\ansno 14. Let
$$\eqalign{p↓k⊗ = \Pr(U↓n\hbox{ begins a run of length }k)\cr\noalign{\vskip3pt}
⊗=\Pr(U↓{n-1} > U↓n < \cdots < U↓{n+k-1} > U↓{n+k})\cr\noalign{\vskip3pt}
⊗= {1\over (k + 2)!}\biggglp{k+2\choose1}{k+1\choose1}-{k+2\choose1}
-{k+2\choose1}+1\bigggrp\cr\noalign{\vskip3pt}
⊗= {k\over (k + 1)!} - {k + 1\over (k + 2)!}\cr}$$
(cf.\ exercise 3.3.2--13). Now proceed as in the previous
exercise to transfer this to $\Pr\biglp f(n) - f(n - 1) = k\bigrp$.\xskip
[We have assumed only that the sequence is $(k + 2)$-distributed.]

\ansno 15. For $s, t ≥ 0$ let
$$\eqalign{p↓{st}⊗ = \Pr(X↓{n-2t-3} = X↓{n-2t-2} ≠ X↓{n-2t-1} ≠
\cdots ≠ X↓{n-1}\cr⊗\hskip 45mm\hbox{and }X↓n= \cdots = X↓{n+s} ≠ X↓{n+s+1}\cr
\noalign{\vskip3pt} ⊗=\textstyle ({1\over 2})↑{s+2t+3};\cr}$$
for $t ≥ 0$ let $q↓t = \Pr(X↓{n-2t-2} = X↓{n-2t-1}
≠ \cdots ≠ X↓{n-1}) = ({1\over 2})↑{2t+1}$. By exercise 7, $\Pr(X↓n$
is not the beginning of a coupon set$) = \sum ↓{t≥0} q↓t = {2\over
3}$; $\Pr(X↓n$ is the beginning of coupon set of length $s + 2) =
\sum ↓{t≥0} p↓{st} = {1\over 3}({1\over 2})↑{s+1}$. Now proceed
as in exercise 13.

\ansno 16. (Solution by R. P. Stanley.)\xskip Whenever the subsequence
$S = (b - 1)$, $(b - 2)$, $\ldotss$, 1, 0, 0, 1, $\ldotss$, $(b - 2)$,
$(b - 1)$ appears, a coupon set must end at the right of $S$,
since some coupon set is completed in the first half of $S$.
We now proceed to calculate the probability that a coupon set
begins at position $n$ by manipulating the probabilities that
the last prior appearance of $S$ ends at position $n - 1$, $n
- 2$, etc., as in exercise 15.

\ansno 18. Proceed as in the proof of Theorem A to calculate
$\underline{\hbox{Pr}}$ and $\overline{\hbox{Pr}}$.

\ansno 19. (Solution by T. Herzog.)\xskip Yes; e.g., the sequence
$\langle U↓{\lfloor n/2\rfloor}\rangle$ when $\langle U↓n\rangle$
satisfies R4 (or even its weaker version), cf.\ exercise 33.

\ansno 21. $\Pr(Z↓n \in M↓1, \ldotss , Z↓{n+k-1} \in M↓k) = p(M↓1)
\ldotsm p(M↓k)$, for all $M↓1$, $\ldotss$, $M↓k \in \Mscr$.

\ansno 22. If the sequence is $k$-distributed, the limit is
zero by integration and Theorem B\null. Conversely, note that if
$f(x↓1, \ldotss , x↓k)$ has an absolutely convergent Fourier
series
$$f(x↓1, \ldotss , x↓k) = \sum ↓{-∞<c↓1, \ldotss , c↓k<∞} a(c↓1,
\ldotss , c↓k)\exp\biglp2πi(c↓1x↓1 + \cdots + c↓kx↓k)\bigrp ,$$
we have $\lim↓{N→∞} {1\over N} \sum ↓{0≤n<N} f(U↓n,
\ldotss , U↓{n+k-1}) = a(0, \ldotss , 0) + \epsilon ↓r$, where
$$|\epsilon ↓r| ≤ \sum ↓{|c↓1|, \ldotss , |c↓k|>r} |a(c↓1, \ldotss, c↓k)|,$$
so $\epsilon ↓r$ can be made arbitrarily small.
Hence this limit is equal to $$a(0, \ldotss , 0) = \int ↑{1}↓{0}
\cdots \int ↑{1}↓{0} f(x↓1, \ldotss , x↓k)\, dx↓1 \ldotsm dx↓k,$$ and
Eq$.$ (8) holds for all sufficiently smooth functions $f$.
The remainder of the proof shows that the function in (9) can
be approximated by smooth functions to any desired accuracy.

\ansno 23. See {\sl AMM \bf 75} (1968), 260--264.

\ansno 24. This follows immediately from exercise 22.

\ansno 25. If the sequence is equidistributed, the denominator
in Corollary S approaches ${1\over 12}$, and the numerator approaches
the quantity in this exercise.

\ansno 26. See {\sl Math$.$ Comp$.$ \bf 17} (1963), 50--54.\xskip [Consider
also the following example by A. G. Waterman: Let $\langle U↓n\rangle$
be an equidistributed $[\,0, 1)$ sequence and $\langle X↓n\rangle$
an $∞$-distributed binary sequence. Let $V↓n = U↓{\lceil\sqrt n\,\rceil}$
or $1 - U↓{\lceil\sqrt n\,\rceil}$ according as $X↓n$ is 0 or 1. Then
$\langle V↓n\rangle$ is equidistributed and white, but $\Pr(V↓n
= V↓{n+1}) = {1\over 2}$. Let $W↓n = (V↓n-ε↓n)\mod 1$
where $\langle \epsilon ↓n\rangle$ is any sequence that decreases
monotonically to 0; then $\langle W↓n\rangle$ is equidistributed
and white, yet $\Pr(W↓n < W↓{n+1}) = {3\over 4}$.]

\ansno 28. Let $\langle U↓n\rangle$ be $∞$-distributed, and
consider the sequence $\langle {1\over 2}(X↓n + U↓n)\rangle $.
This is 3-distributed, using the fact that $\langle U↓n\rangle$
is $(16, 3)$-distributed.

\ansno 29. If $x = x↓1x↓2 \ldotsm x↓t$ is any binary number,
we can consider the number $\nu ↑{E}↓{x}(n)$ of times $X↓p \ldotsm
X↓{p+t-1} = x$, where $1 ≤ p ≤ n$ and $p$ is even. Similarly,
let $\nu ↑{O}↓{x}(n)$ count the number of times when $p$ is
odd. Let $\nu ↑{E}↓{x}(n) + \nu ↑{O}↓{x}(n)=\nu↓x(n)$. Now
\def\\{\char'403 }
$$\nu ↑{E}↓{0}(n) = \sum \nu ↑{E}↓{0\\\\\ldots\\}(n) \approx
\sum \nu ↑{O}↓{\\0\\\ldots\\}(n)
\approx \sum \nu ↑{E}↓{\\\\0\ldots\\}
(n) \approx \cdots \approx \sum \nu ↑{O}↓{\\\\\\\ldots0}(n)$$
where the $\nu $'s in these summations have $2k$
subscripts, $2k-1$ of which are asterisks (meaning that they are being
summed over---each sum is taken over $2↑{2k-1}$ combinations of zeros and ones),
and where ``$\approx$'' denotes
approximate equality (except for an error of at most $2k$ due
to end conditions). Therefore we find that
$$\twoline{{1\over n} 2k\nu ↑{E}↓{0}(n) = {1\over n} \left(\sum
\nu ↓{\\0\\\ldots\\}(n)+\cdots+\sum\nu↓{\\\\\\\ldots0}(n)\right)}{3pt}{
\null+{1\over n}
\chop to 9pt{\sum↓x\biglp r(x)-s(x)\bigrp\nu↓x↑E(n)}+O\left(1\over n\right),}$$
where $x = x↓1 \ldotsm x↓{2k}$ contains $r(x)$ zeros
in odd positions and $s(x)$ zeros in even positions. By $(2k)$-distribution,
the parenthesized quantity tends to $k(2↑{2k-1})/2↑{2k} = k/2$.
The remaining sum is clearly a maximum if $\nu ↑{E}↓{x}(n) =
\nu ↓x(n)$ when $r(x) > s(x)$, and $\nu ↑{E}↓{x}(n) = 0$ when
$r(x) < s(x)$. So the maximum of the right-hand side becomes
$${k\over 2} + \sum ↓{0≤s<r≤k} (r - s)\left.{k\choose r}{k\choose s}\right/
2↑{2k} = {k\over 2} + k\left.{2k-1\choose k}\right/2↑{2k}.$$
%folio 720 galley 9 (C) Addison-Wesley 1978	*
Now $\overline{\hbox{Pr}}(X↓{2n} = 0) ≤ \limsup↓{n→∞}\nu ↑{E}↓{0}(2n)/n$,
so the proof is complete.\xskip {\sl Note:}
$$\eqalign{\noalign{\vskip 4pt}
\sum ↓{r,s}{n\choose r}{n\choose s}\max(r,s)⊗=
2n2↑{2n-2} + n{2n-1\choose n};\cr
\sum ↓{r,s}{n\choose r}{n\choose s}\min(r,s)⊗=
2n2↑{2n-2} - n{2n-1\choose n}.\cr}$$

\ansno 30. Let $f(x↓1, x↓2, \ldotss , x↓{2k}) = \hbox{sign}
(x↓1 - x↓2 + x↓3 - x↓4 + \cdots - x↓{2k})$. Construct a
directed graph with $2↑{2k}$ nodes labeled $(E; x↓1, \ldotss
, x↓{2k-1})$ and $(O; x↓1, \ldotss , x↓{2k-1})$, where each $x$
is either 0 or 1. Let there be $1 + f(x↓1, x↓2, \ldotss , x↓{2k})$
directed arcs from $(E; x↓1, \ldotss , x↓{2k-1})$ to $(O; x↓2,
\ldotss , x↓{2k})$, and $1 - f(x↓1, x↓2, \ldotss , x↓{2k})$ directed
arcs leading from $(O; x↓1, \ldotss , x↓{2k-1})$ to $(E; x↓2,
\ldotss , x↓{2k})$. We find that each node has the same number of
arcs leading into it as those leading out; for example, $(E;
x↓1, \ldotss , x↓{2k-1})$ has $1 - f(0, x↓1, \ldotss , x↓{2k-1})
+ 1 - f(1, x↓1, \ldotss , x↓{2k-1})$ leading in and $1 + f(x↓1,
\ldotss , x↓{2k-1}, 0) + 1 + f(x↓1, \ldotss , x↓{2k-1}, 1)$ arcs
leading out, and $f(x, x↓1, \ldotss , x↓{2k-1}) = -f(x↓1, \ldotss
, x↓{2k-1}, x)$. Drop all nodes that have no paths leading
either in or out, i.e., $(E; x↓1, \ldotss , x↓{2k-1})$ if $f(0,
x↓1, \ldotss , x↓{2k-1}) = +1$, or $(O; x↓1, \ldotss , x↓{2k-1})$
if $f(1, x↓1, \ldotss , x↓{2k-1}) = -1$. The resulting directed
graph is seen to be connected, since we can get from any node
to $(E; 1, 0, 1, 0, \ldotss , 1)$ and from this to any desired
node. By Theorem 2.3.4.2G\null, there is a cyclic path traversing
each arc; this path has length $2↑{2k+1}$, and we may assume
that it starts at node $(E; 0, \ldotss , 0)$. Construct a cyclic
sequence with $X↓1 = \cdots = X↓{2k-1} = 0$, and $X↓{n+2k-1}
= x↓{2k}$ if the $n$th arc of the path is from $(E; x↓1, \ldotss
, x↓{2k-1})$ to $(O; x↓2, \ldotss , x↓{2k})$ or from $(O; x↓1,
\ldotss , x↓{2k-1})$ to $(E; x↓2, \ldotss , x↓{2k})$. For example,
the graph for $k = 2$ is shown in Fig.\ A--5; the arcs of the
cyclic path are numbered from 1 to 32, and the cyclic sequence
is (00001000110010101001101110111110)(00001$\ldotsm$). Note that
$\Pr(X↓{2n} = 0) = {11\over 16}$ in this sequence. The sequence
is clearly $(2k)$-distributed, since each $(2k)$-tuple $x↓1x↓2
\ldotsm x↓{2k}$ occurs $1 + f(x↓1, \ldotss , x↓{2k}) + 1 - f(x↓1,
\ldotss , x↓{2k}) =2$ times in the cycle. The fact that $\Pr(X↓{2n}
= 0)$ has the desired value comes from the fact that the maximum
value on the right-hand side in the proof of the preceding exercise
has been achieved by this construction.

\topinsert{\vskip 63mm
\ctrline{\caption Fig.\ A--5. Directed graph for the
construction in exercise 30.}}

\ansno 31. Use Algorithm W with rule $\Rscr↓1$ selecting the entire sequence.\xskip
[For a generalization of this
type of nonrandom behavior in R5-sequences, see Jean Ville,
{\sl \'Etude Critique de la notion de Collectif} (Paris,
1939), 55--62. Perhaps R6 is also too weak, from this standpoint.]

\ansno 32. If $\Rscr,\Rscr↑\prime$ are computable subsequence rules,
so is $\Rscr↑{\prime\prime} = \Rscr\Rscr↑\prime$ defined by the following
functions: $f↓n↑{\prime\prime}(x↓0, \ldotss , x↓{n-1}) = 1$ iff
$\Rscr$ defines the subsequence $x↓{r↓1}$, $\ldotss
$, $x↓{r↓k}$ of $x↓0$, $\ldotss$, $x↓{n-1}$, where $k ≥ 0$ and
$0 ≤ r↓1 < \cdots < r↓k < n$ and $f↓k↑\prime (x↓{r↓1},
\ldotss, x↓{r↓k}) = 1$.

Now $\langle X↓n\rangle\Rscr\Rscr↑\prime$ is $\biglp \langle
X↓n\rangle\Rscr\bigrp\Rscr↑\prime $. The result follows immediately.

\ansno 33. Given $\epsilon > 0$, find $N↓0$ such that $N > N↓0$
implies that both $|\nu ↓r(N)/N - p| < \epsilon$ and $|\nu ↓s(N)/N
- p| < \epsilon $. Then find $N↓1$ such that $N > N↓1$ implies
that $t↓N$ is $r↓M$ or $s↓M$ for some $M > N↓0$. Now $N > N↓1$
implies that
$$\left| {\nu ↓t(N)\over N} - p\right| = \left| {\nu ↓r(N↓r)
+ \nu ↓s(N↓s)\over N} - p\right| = \left| {\nu ↓r(N↓r) - pN↓r
+ \nu ↓s(N↓s) - pN↓s\over N↓r + N↓s}\right|<ε.$$

\ansno 34. For example, if the binary representation
of $t$ is $(1\ 0↑{b-2}\ 1\ 0↑{a↓1}\ 1\ 1\ 0↑{a↓2}\ 1\ \ldots\
1\ 0↑{a↓k})↓2$, where ``$0↑a$'' stands for a sequence of $a$
consecutive zeros, let the rule $\Rscr↓t$ accept $U↓n$ if and
only if $\lfloor bU↓{n-k}\rfloor = a↓1$, $\ldotss$, $\lfloor bU↓{n-1}\rfloor
= a↓k$.

\ansno 35. Let $a↓0 = s↓0$ and $a↓{m+1} = \max\leftset s↓k \relv 0 ≤ k
< 2↑{a↓m}\rightset$. Construct a subsequence rule that selects
element $X↓n$ if and only if $n = s↓k$ for some $k < 2↑{a↓m}$,
when $a↓m ≤ n < a↓{m+1}$. Then $\lim↓{m→∞}\nu (a↓m)/a↓m
= {1\over 2}$.

\ansno 36. Let $b$ and $k$ be arbitrary but fixed integers greater
than 1. Let $Y↓n = \lfloor bU↓n\rfloor $. An arbitrary infinite
subsequence $\langle Z↓n\rangle = \langle Y↓{s↓n}\rangle\Rscr$
determined by algorithms $\Sscr$ and $\Rscr$ (as in the proof of
Theorem M) corresponds in a straightforward but notationally
hopeless manner to algorithms $\Sscr↑\prime$ and $\Rscr↑\prime$ that
inspect $X↓t$, $X↓{t+1}$, $\ldotss$, $X↓{t+s}$ and/or select $X↓t$,
$X↓{t+1}$, $\ldotss$, $X↓{t+\min(k-1,s)}$ of $\langle X↓n\rangle$
if and only if $\Sscr$ and $\Rscr$ inspect and/or select $Y↓s$, where
$U↓s = (0.X↓tX↓{t+1} \ldotsm X↓{t+s})↓2$. Algorithms $\Sscr↑\prime$ and
$\Rscr↑\prime$ determine an infinite 1-distributed subsequence of
$\langle X↓n\rangle$ and in fact (as in exercise 32) this
subsequence is $∞$-distributed so it is $(k, 1)$-distributed.
Hence we find that $\underline{\hbox{Pr}}(Z↓n = a)$ and
$\overline{\hbox{Pr}}(Z↓n=a)$ differ from $1/b$ by less than $1/2↑k$.

[The result of this exercise is true if ``R6'' is replaced
consistently by ``R4'' or ``R5''; but it is false if ``R1'' is used, since
$X↓{n\choose2}$ might be identically zero.]

\ansno 37. For $n ≥ 2$ replace $U↓{n↑2}$ by ${1\over 2}(U↓{n↑2}
+ \delta ↓n)$, where $\delta ↓n = 0$ or 1 according as the set $\{U↓{(n-1)↑2+1},
\ldotss , U↓{n↑2-1}\}$ contains an even or odd
number of elements less than ${1\over 2}$.\xskip [{\sl Advances in
Math$.$ \bf 14} (1974), 333--334.]

\ansno 39. See {\sl Acta Arithmetica \bf 21} (1972), 45--50.
The best possible value of $c$ is unknown.

\ansno 40. If every one-digit change to a random table yields
a random table, all tables are random (or none are). If we don't
allow degrees of randomness, the answer must therefore be, ``Not
always.''

%folio 722 galley 10a (C) Addison-Wesley 1978	*
\ansbegin{3.6}

\mixans 1. {⊗RANDI⊗STJ⊗9F⊗Store exit location.\cr
⊗⊗STA⊗8F⊗Store value of $k$.\cr
⊗⊗LDA⊗XRAND⊗$\rA ← X$.\cr
⊗⊗MUL⊗7F⊗$\rAX ← aX$.\cr
⊗⊗INCX⊗1009⊗$\rX ← (aX + c)\mod m$.\cr
⊗⊗SLAX⊗5⊗$\rA ← (aX+c)\mod m$.\cr
⊗⊗STA⊗XRAND⊗Store $X$.\cr
⊗⊗MUL⊗8F⊗$\rA ← \lfloor kX/m\rfloor$.\cr
⊗⊗INCA⊗1⊗Add 1, so that $1 ≤ Y ≤k$.\cr
⊗9H⊗JMP⊗*⊗Return.\cr
⊗XRAND⊗CON⊗1⊗Value of $X$; $X↓0 = 1$.\cr
⊗8H⊗CON⊗0⊗Temp storage of $k$.\cr
⊗7H⊗CON⊗3141592621⊗The multiplier $a$.\quad\blackslug\cr}
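For readers without a MIX assembler at hand, the subroutine's effect can be mimicked in Python (an illustrative analogue; it assumes a decimal MIX with ten-digit words, so that $m = 10^{10}$, and takes the multiplier 3141592621 and increment 1009 from the code above):

```python
class Randi:
    """Linear congruential generator X <- (aX + c) mod m, returning
    Y = floor(kX/m) + 1 so that 1 <= Y <= k, as in the MIX subroutine."""
    def __init__(self, x0=1, a=3141592621, c=1009, m=10**10):
        self.x, self.a, self.c, self.m = x0, a, c, m
    def __call__(self, k):
        self.x = (self.a * self.x + self.c) % self.m   # new X
        return k * self.x // self.m + 1                # scale into 1..k
```

The scaling step corresponds to the `MUL 8F` and `INCA 1` instructions: multiplying the fraction $X/m$ by $k$ and taking the integer part.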

\ansno 2. Putting a random-number generator into a program
makes the results essentially unpredictable to the programmer.
If the behavior of the machine on each problem were known in
advance, few programs would ever be written. As Turing has said,
the actions of a computer quite often {\sl do} surprise its programmer,
especially when a program is being debugged.

So the world had better watch out.
%folio 723 galley 10b (C) Addison-Wesley 1978	*
\ansbegin{4.1}

\ansno 1. 1010, 1011, 1000, $\ldotss$,
11000, 11001, 11110.

\ansno 2. (a)\9 $-110001$, $-11.001001001001 \ldotss
$, $11.0010010000111111 \ldotss\,$.

(b)\9 $11010011$, $1101.001011001011 \ldotss$ , $111.011001000100000
\ldotss\,$.

\def\≡{\overline1}
(c)\9 $\≡11\≡\≡$, $\≡0.0\≡\≡0110\≡\≡011 \ldotss
$, $10.011\≡111\≡ \ldotss\,$.

(d)\9 $-9.4$, $- \ldotsm 7582417582413$, $\ldotss 562951413$.

\ansno 3. $(1010113.2)↓{2i}$.

\ansno 4. (a)\9 Between rA and rX.\xskip (b) The remainder
in rX has radix point between bytes 3 and 4; the quotient in
rA has radix point one byte to the right of the least significant
portion of the register.

\ansno 5. It has been subtracted from $999 \ldotsm
9 = 10↑p - 1$, instead of from $1000 \ldotsm 0 = 10↑p$.

\ansno 6. (a,c)\9 $2↑{p-1} - 1$, $-(2↑{p-1} - 1)$;\xskip (b) $2↑{p-1}
- 1$, $-2↑{p-1}$.

\ansno 7. A ten's complement representation for
a negative number $x$ can be obtained by considering $10↑n +
x$ (where $n$ is large enough for this to be positive) and extending
it on the left with infinitely many nines. The nines' complement
representation can be obtained in the usual manner.\xskip (These two
representations are equal for nonterminating decimals, otherwise
the nines' complement representation has the form $\ldotss (a)99999
\ldots$ while the ten's complement representation has the form
$\ldotss (a + 1)0000 \ldotss\,$.) The representations may be
considered sensible if we regard the value of the infinite sum
$N = 9 + 90 + 900 + 9000 + \cdots$ as $-1$, since $N
- 10N = 9$.

See also exercise 31, which considers $p$-adic
number systems. The latter agree with the $p$'s complement notations
considered here, for numbers whose radix-$p$ representation
is terminating, but there is no simple relation between the
field of $p$-adic numbers and the field of real numbers.

\ansno 8. $\sum ↓j a↓jb↑j = \sum ↓j (a↓{kj+k-1}b↑{k-1}
+ \cdots + a↓{kj})b↑{kj}$.

\ansno 9. {\tt A BAD AD0BE FACADE FADED}.\xskip [{\sl Note:}
Other possible ``number sentences'' would be \.{D0} \.{A} \.{DEED} \.{A}
\.{DECADE}; \.{A} \.{CAD} \.{FED} \.{A} \.{BABE} \.{BEEF},
\.{C0C0A}, \.{C0FFEE}; \.{B0B} \.{FACED} \.{A} \.{DEAD} \.{D0D0}.]
%folio 725 galley 1 (C) Addison-Wesley 1978	*
\ansno 10. $\dispstyle
\left[\vcenter{\halign{\rt{$#$}⊗\rt{$\,#$}⊗\rt{$\,#$}⊗\rt{$\,
#$}⊗\rt{$\,#$}⊗\rt{$\;#$}⊗\rt{$\;#\ldotsm$}\cr
\ldotsm,⊗a↓3,⊗a↓2,⊗a↓1,⊗a↓0;⊗a↓{-1},⊗a↓{-2},\cr
\ldotsm,⊗b↓3,⊗b↓2,⊗b↓1,⊗b↓0;⊗b↓{-1},⊗b↓{-2},\cr}}\right]=
\left[\vcenter{\halign{\rt{$#$}⊗\rt{$\,#$}⊗\rt{$\,#$}⊗\rt{$\,
#$}⊗\rt{$\,#$}⊗\rt{$\;#$}⊗\rt{$\;#\ldotsm$}\cr
\ldotsm,⊗A↓3,⊗A↓2,⊗A↓1,⊗A↓0;⊗A↓{-1},⊗A↓{-2},\cr
\ldotsm,⊗B↓3,⊗B↓2,⊗B↓1,⊗B↓0;⊗B↓{-1},⊗B↓{-2},\cr}}\right]$, if
$$A↓j=\left[\vcenter{\halign{\rt{$# $}⊗\rt{$\,#,\ldotss,$}⊗\rt{$\,#$}\cr
a↓{k↓{j+1}-1},⊗a↓{k↓{j+1}-2}⊗a↓{k↓j}\cr
⊗b↓{k↓{j+1}-2}⊗b↓{k↓j}\cr}}\right],\qquad B↓j=b↓{k↓{j+1}-1}\ldotsm b↓{k↓j},$$
where $\langle k↓n\rangle$ is any infinite sequence
of integers with $k↓{j+1} > k↓j$.

\ansno 11. (The following algorithm works both for addition
or subtraction, depending on whether the plus or minus sign
is chosen.)

Start by setting $k ← a↓{n+1} ← a↓{n+2} ← b↓{n+1}
← b↓{n+2} ← 0$; then for $m = 0$, 1, $\ldotss$, $n + 2$ do the following:
Set $c↓m ← a↓m \pm b↓m + k$; then if $c↓m ≥ 2$, set $k ← -1$
and $c↓m ← c↓m - 2$; otherwise if $c↓m < 0$, set $k ← 1$ and $c↓m ← c↓m
+ 2$; otherwise (i.e., if $0 ≤ c↓m ≤ 1$), set $k ← 0$.
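The carry rules transcribe directly; here is a Python version (illustrative; digits are kept least significant first, and two extra digit positions absorb the final carries, as in the text):

```python
def negatwo_addsub(a, b, subtract=False):
    """Add or subtract base -2 numbers given as little-endian 0/1 digit
    lists, following the carry rules above; returns a digit list."""
    L = max(len(a), len(b)) + 2            # two extra positions suffice
    a = a + [0] * (L - len(a))
    b = b + [0] * (L - len(b))
    c, k = [], 0
    for m in range(L):
        cm = a[m] + (-b[m] if subtract else b[m]) + k
        if cm >= 2:
            k, cm = -1, cm - 2             # 2*(-2)^m = -(-2)^(m+1), so carry -1
        elif cm < 0:
            k, cm = 1, cm + 2              # -2*(-2)^m = (-2)^(m+1), so carry +1
        else:
            k = 0
        c.append(cm)
    return c
```

For example, adding $(1)↓{-2}$ to itself produces $(0,0,1,1)$ little-endian, that is, $(1100)↓{-2} = 2$.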

\ansno 12. (a)\9 Subtract $\pm(\ldotsm a↓30a↓10)↓{-2}$ from $\pm
(\ldotsm a↓40a↓20a↓0)↓{-2}$
in the negabinary system.\xskip (See also exercise 7.1--18 for a trickier
solution.)\xskip (b) Subtract $(\ldotsm b↓30b↓10)↓2$
from $(\ldotsm b↓40b↓20b↓0)↓2$ in the binary system.

\ansno 13. $(1.909090\ldotsm)↓{-10} = (0.090909\ldotsm)↓{-10} = {1\over 11}$.

\ansno 14. \hbox to 329pt{$\def\\{\hskip 4.5pt}\hskip0pt plus 200pt
\eqalign{1\\1\\3\\2\\1⊗\qquad[5-4i]\cr
1\\1\\3\\2\\1⊗\qquad[5-4i]\cr
\noalign{\vskip -7.6pt}
\vbox{\hrule width 40.5pt}\cr
\noalign{\vskip-1.75pt}
1\\1\\3\\2\\1\cr
1\\1\\2\\0\\2\\\\\cr
1\\2\\1\\2\\3\\\\\\\\\cr
1\\1\\3\\2\\1\\\\\\\\\\\\\cr
1\\1\\3\\2\\1\\\\\\\\\\\\\\\\\cr
\noalign{\vskip 3pt\hrule width 76.5pt\vskip 3pt}
0\\1\\0\\3\\1\\1\\2\\0\\1⊗\qquad[9-40i]\cr}\hskip0pt plus 200pt$}

\ansno 15. $[-{10\over 11}, {1\over 11}]$, and the rectangle
shown in Fig.\ A--6.

\topinsert{\vskip 40mm 
\moveright 190pt\hbox to 128pt{\caption Fig.\ A--6.
Fundamental region for quater-imaginary numbers.}}

\ansno 16. It is tempting to try to do this in a very simple
way, by using the rule $2 = (1100)↓{i-1}$ to take care of carries;
but that leads to a nonterminating method if, for example, we try to add 1
to $(11101)↓{i-1} = -1$.

The following solution does the job by providing
four related algorithms (namely for adding or subtracting 1
or $i$). If $α$ is a string of zeros and ones, let $α↑P$ be
a string of zeros and ones such that $(α↑P)↓{i-1} = (α)↓{i-1}
+ 1$; and let $α↑{-P}$, $α↑Q$, $α↑{-Q}$ be defined similarly, with
$-1$, $+i$, and $-i$ respectively in place of $+1$. Then
$$\eqalign{(α0)↑P⊗=α1;\cr(αx0)↑{-P}⊗=α↑{-Q}x1;\cr}\quad
\eqalign{(αx1)↑P⊗=α↑Qx0.\cr(α1)↑{-P}⊗=α0.\cr}\qquad\qquad
\eqalign{(α0)↑Q⊗=α↑P1;\cr(α0)↑{-Q}⊗=α↑Q1;\cr}\quad
\eqalign{(α1)↑Q⊗=α↑{-Q}0.\cr(α1)↑{-Q}⊗=α↑{-P}0.\cr}$$
Here $x$ stands for either 0 or 1, and the strings are
extended on the left with zeros if necessary. The processes
will clearly always terminate. Hence every number of the form
$a + bi$ with $a$ and $b$ integers is representable in the $i - 1$
system.
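The four mutually recursive rules can be animated in Python (an illustrative transcription; strings are most significant digit first, and the operations are labeled `'P'`, `'-P'`, `'Q'`, `'-Q'` as above):

```python
def step(s, op):
    """Apply one operation to the (i-1)-system string s of 0s and 1s:
    'P' adds 1, '-P' subtracts 1, 'Q' adds i, '-Q' subtracts i."""
    s = '00' + s                   # extend on the left with zeros as needed
    if op == 'P':
        r = s[:-1] + '1' if s[-1] == '0' else step(s[:-2], 'Q') + s[-2] + '0'
    elif op == '-P':
        r = s[:-1] + '0' if s[-1] == '1' else step(s[:-2], '-Q') + s[-2] + '1'
    elif op == 'Q':
        r = step(s[:-1], 'P') + '1' if s[-1] == '0' else step(s[:-1], '-Q') + '0'
    else:  # '-Q'
        r = step(s[:-1], 'Q') + '1' if s[-1] == '0' else step(s[:-1], '-P') + '0'
    return r.lstrip('0') or '0'
```

For instance, `step('1', 'P')` yields `'1100'`, since $(1100)↓{i-1} = (i-1)↑2 \cdot i = 2$; repeated applications build up a representation of any Gaussian integer $a + bi$.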

\ansno 17. No (in spite of exercise 28); the number $-1$ cannot be so represented.
This can be proved by constructing a set $S$ as in Fig.\ 1. We do have
the representations $-i = (0.1111\ldotsm)↓{1+i}$, $i = (100.1111\ldotsm
)↓{1+i}$.

\lineskip 0pt
\ansno 18. Let $S↓0$ be the set of points $(a↓7a↓6a↓5a↓4a↓3a↓2a↓1a↓0)↓{i-1}$,
where each $a↓k$ is 0 or 1.\xskip (Thus, $S↓0$ is given by the 256 interior
dots shown in Fig.\ 1, if that picture is multiplied by 16.)\xskip
We first show that $S$ is closed: If $y↓1$, $y↓2$, $\ldots$ is an
infinite subset of $S$, we have $y↓n = \sum ↓{k≥1\lower 1pt\null} a↓{nk}16↑{-k}$,
where each $a↓{nk}$ is in $S↓0$. Construct a tree whose nodes
are $(a↓{n1}, \ldotss , a↓{nr})$, for $1 ≤ r ≤ n$, and let a
node of this tree be an ancestor of another node if it is an
initial subsequence of that node. By the infinity lemma this
tree has an infinite path $(a↓1, a↓2, a↓3, \ldotss)$, and it follows that $\sum
↓{k≥1} a↓k16↑{-k}$ is a limit point of $\{y↓1, y↓2, \ldotss\}$
in $S$.

\lineskip 1pt
By the answer to exercise 16, all numbers of the
form $(a + bi)/16↑k$ are representable, when $a$ and $b$ are
integers. Therefore if $x$ and $y$ are arbitrary reals and $k
≥ 1$, the number $z↓k = (\lfloor 16↑kx\rfloor + \lfloor 16↑ky\rfloor
i)/16↑k$ is in $S + m + ni$ for some integers $m$ and $n$. It
can be shown that $S + m + ni$ is bounded away from the origin
when $(m, n) ≠ (0, 0)$. Consequently if $|x|$ and $|y|$ are
fixed and $k$ is sufficiently large, we have $z↓k
\in S$, and $\lim↓{k→∞}z↓k = x + yi$ is in $S$.

\ansno 19. If $m > u$ or $m < l$, find $a \in D$ such that $m
≡ a \modulo b$; the desired representation will be a representation
of $m↑\prime = (m - a)/b$ followed by $a$. Note that $m > u$
implies $l < m↑\prime < m$; $m < l$ implies $m < m↑\prime < u$;
so the algorithm terminates.

[There are no solutions when $b = 2$. The representation
will be unique iff $0 \in D$; nonunique representation occurs
for example when $D = \{-3, -1, 7\}$, $b = 3$, since $(α)↓3 =
(\overline377\overline3α)↓3$. When $b ≥ 3$ it is not difficult
to show that there are exactly $2↑{b-3}$ solution sets $D$ in
which $|a| < b$ for all $a \in D$. Furthermore the set $D =
\{0$, 1, $2 - ε↓2b↑n$, $3 - ε↓3b↑n$, $\ldotss$, $b - 2 - ε↓{b-2}b↑n$, $b
- 1 - b↑n\}$ gives unique representations, for all $b ≥ 3$ and
$n ≥ 1$, when each $ε↓j$ is 0 or 1. Reference: {\sl Proc.\ IEEE Symp.\ Comp.\
Arith.\ \bf4} (1978), 1--9.]

\ansno 20. (a)\9 $0.\overline1\overline1\overline1\,\ldots=\overline1.888\,
\ldots=\overline18.{111\atop777}\,\ldots=\overline18{1\atop7}.{222\atop666}\,
\ldots=\cdots=\overline18{123456\atop765432}.{777\atop111}\,\ldots$ has
nine representations.\xskip (b) A ``$D$-fraction'' $.a↓1a↓2\,\ldots$
always lies between $-1/9$ and $+71/9$. Suppose $x$ has ten or more
$D$-decimal representations. Then for sufficiently large $k$,
$10↑kx$ has ten representations that differ to the left of the
decimal point: $10↑kx = n↓1 + f↓1 = \cdots = n↓{10} + f↓{10}$
where each $f↓j$ is a $D$-fraction. By uniqueness of integer representations,
the $n↓j$ are distinct, say $n↓1 < \cdots < n↓{10}$, hence $n↓{10}
- n↓1 ≥ 9$; but this implies $f↓1 - f↓{10} ≥ 9 > 71/9 - (-1/9)$,
a contradiction.\xskip (c) Any number of the form $0.a↓1a↓2 \ldotsm
$, where each $a↓j$ is $-1$ or 8, equals $\overline1.a↓1↑\prime a↓2↑\prime
\,\ldots$ where $a↑\prime↓{j} = a↓j + 9$ (and it even has six
{\sl more} representations $\overline18.a↑{\prime\prime}↓1a↑{\prime\prime}↓2
\ldotsm$, etc.).

\ansno 21. We can convert to such a representation by using
a method like that suggested in the text for converting to balanced
ternary.

In contrast to the systems of exercise 20, zero
can be represented in infinitely many ways, all obtained from
${1\over 2} + \sum ↓{k≥1}(-4{1\over 2}) \cdot 10↑{-k}$ (or
from the negative of this representation) by multiplying it
by a power of ten. The representations of unity are $1{1\over
2} - {1\over 2}$, ${1\over 2} + {1\over 2}$, $5 - 3{1\over
2} - {1\over 2}$, $5 - 4{1\over 2} + {1\over 2}$, $50 - 45
- 3{1\over 2} - {1\over 2}$, $50 - 45 - 4{1\over 2} + {1\over
2}$, etc., where $\pm {1\over 2} = (\pm 4{1\over 2})(10↑{-1}
+ 10↑{-2} + \cdotss)$.\xskip [{\sl AMM \bf 57} (1950), 90--93.]

\ansno 22. Given some approximation $b↓n \ldotsm
b↓1b↓0$ with error $\sum ↓{0≤k≤n}b↓k10↑k
- x > 10↑{-t}$ for $t > 0$, we will show how to reduce the error
by approximately $10↑{-t}$.\xskip (The process can be started by finding
a suitable $\sum ↓{0≤k≤n} b↓k10↑k > x$; then a finite
number of reductions of this type will make the error less than
$ε$.)\xskip Simply choose $m > n$ so large that the decimal representation
of $-10↑mα$ has a one in position $10↑{-t}$ and no ones in positions
$10↑{-t+1}$, $10↑{-t+2}$, $\ldotss$, $10↑n$. Then $10↑mα + ($a suitable
sum of powers of 10 between $10↑m$ and $10↑n) + \sum ↓{0≤k≤n}
b↓k10↑k \approx \sum ↓{0≤k≤n}b↓k10↑k - 10↑{-t}$.

\ansno 23. The set $S = \leftset\sum ↓{k≥1} a↓kb↑{-k}\relv a↓k \in
D\rightset$ is closed as in exercise 18, hence measurable, and in fact
it has positive measure. Since $bS = \union↓{a\in D}(a + S)$, we
have $b\mu (S) = \mu (bS) ≤ \sum ↓{a\in D}\mu (a + S) = \sum
↓{a\in D}\mu (S) = b\mu (S)$, and we must have $\mu \biglp
(a + S) ∩ (a↑\prime + S)\bigrp = 0$ when $a ≠ a↑\prime \in D$.
Now $T$ has measure zero
since it is a union of countably many sets of the form $10↑k\biglp
n + ((a + S) ∩ (a↑\prime + S))\bigrp$, $a ≠ a↑\prime $, each
of measure zero.

[The set $T$ cannot be empty, since the real numbers
cannot be written as a countable union of disjoint, closed,
bounded sets; cf.\ {\sl AMM \bf 84} (1977), 827--828.
If $D$ has less than $b$ elements, the set of
numbers representable with radix $b$ and digits from $D$ has
measure zero. If $D$ has more than $b$ elements and represents all
reals, $T$ has infinite measure.]

\ansno 24. $\leftset 2a\cdot10↑k+a↑\prime\relv0≤a<5,0≤a↑\prime<2\rightset$
or $\leftset 5a↑\prime\cdot10↑k+a\relv0≤a<5,0≤a↑\prime<2\rightset$,
for $k ≥ 0$.\xskip [R. L. Graham has shown
that there are no more sets of integer digits with these properties.
And Andrew Odlyzko has shown that the restriction to integers
is superfluous, in the sense that if the smallest two elements
of $D$ are 0 and 1, all the digits must be integers.\xskip {\sl Proof:}
Let $S=\leftset\sum↓{k<0}a↓kb↑k\relv a↓k\in D\rightset$
and $X = \leftset(a↓n \ldotsm a↓0)↓b \relv a↓k \in D\rightset$; then $[0, ∞) =
\union↓{x\in X}(x + S)$, and $(x + S) ∩ (x↑\prime + S)$ has measure
zero for $x ≠ x↑\prime \in X$. We have $(0, 1) \subset S$, and
by induction on $m$ we will prove that $(m, m + 1) \subset x↓m
+ S$ for some $x↓m \in X$. Let $x↓m\in X$ be such that $(m, m
+ ε)∩ (x↓m + S)$ has positive measure for all $ε > 0$. Then $x↓m
≤ m$, and $x↓m$ must be an integer lest $x↓{\lfloor x↓m\rfloor}
+ S$ overlap $x↓m + S$ too much. If $x↓m > 0$, the fact that
$(m - x↓m, m - x↓m + 1) ∩ S$ has positive measure implies by
induction that this measure is 1, and $(m, m + 1) \subset x↓m
+ S$ since $S$ is closed. If $x↓m = 0$ and $(m, m + 1) \not
\subset S$, we must have $m < x↑\prime↓{m}
< m + 1$ for some $x↑\prime↓{m} \in X$, where $(m, x↑\prime↓{m})
\subset S$; but then $1 + S$ overlaps $x↑\prime↓{m} + S$. Reference:
{\sl Proc.\ London Math.\ Soc.\ \bf18} (1978), to appear.]

[If we drop the restriction $0 \in D$, there {\sl
are} many other cases, some of which are quite interesting,
especially the sets $\{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}$, $\{1,
2, 3, 4, 5, 51, 52, 53, 54, 55\}$, and $\{2, 3, 4, 5, 6, 52, 53,
54, 55, 56\}$. Alternatively if we allow negative digits we obtain
many other solutions by the method of exercise 19, plus further
sets like $\{-1$, 0, 1, 2, 3, 4, 5, 6, 7, $18\}$ which don't meet
the conditions of that exercise; it appears hopeless to find
a nice characterization of all solutions with negative digits.]

\ansno 25. A positive number whose base $b$ representation has
$m$ consecutive $(b - 1)$'s to the right of the decimal point
must have the form $c/b↑n + (b↑m - \theta )/b↑{n+m}$, where
$c$ and $n$ are nonnegative integers and $0 < \theta ≤1$. So if $u/v$
has this form, we find that $b↑{m+n}u = b↑mcv + b↑mv - \theta
v$. Therefore $\theta v$ is an integer that is a multiple of
$b↑m$. But $0 < \theta v ≤ v < b↑m$.\xskip $\biglp$There can be arbitrarily
long runs of other digits $aaaaa$, if $0 ≤ a < b - 1$,
for example in the representation of $a/(b - 1).\bigrp$
%folio 731 galley 2a (C) Addison-Wesley 1978	*
\ansno 26. The proof of ``sufficiency'' is a straightforward
generalization of the usual proof for base $b$, by successively
constructing the desired representation. The proof of ``necessity''
breaks into two parts: If $β↓{n+1}$ is greater than $\sum ↓{k≤n}
c↓kβ↓k$ for some $n$, then $β↓{n+1} - ε$ has no representation
for small $ε$. If $β↓{n+1} ≤ \sum ↓{k≤n} c↓kβ↓k$ for all
$n$, but equality does not always hold, we can show that there are
two representations for certain $x$.\xskip [See {\sl Transactions
of the Royal Society of Canada}, series III, {\bf 46} (1952),
45--55.]

\ansno 27. Proof by induction on $|n|$: If $n$ is even we must
take $e↓0 > 0$, and the result follows by induction, since
$n/2$ has a unique such representation. If $n$ is odd, we must
take $e↓0 = 0$, and the problem reduces to representing $-(n
- 1)/2$; if the latter quantity is either zero or one, there
is obviously only one way to proceed, otherwise it has a unique reversing
representation by induction.

[It follows that every positive integer has exactly {\sl two} such
representations with {\sl decreasing} exponents $e↓0>e↓1>\cdots>e↓t$:
one with $t$ even and the other with $t$ odd.]
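The inductive construction can be transcribed directly. Here is an illustrative Python rendering (not part of the original answer), returning the exponents $e↓0 < e↓1 < \cdots < e↓t$ with $n = 2↑{e↓0} - 2↑{e↓1} + 2↑{e↓2} - \cdots\,$.

```python
def reversing_rep(n):
    """Unique "reversing" representation n = 2^e0 - 2^e1 + 2^e2 - ...
    with e0 < e1 < ...: if n is even, halve it (forcing e0 > 0);
    if n is odd, take e0 = 0 and represent -(n - 1)/2."""
    if n == 0:
        return []
    if n % 2 == 0:
        return [e + 1 for e in reversing_rep(n // 2)]
    return [0] + [e + 1 for e in reversing_rep(-(n - 1) // 2)]

def value(exps):
    """Evaluate a list of exponents with alternating signs."""
    return sum((-1)**k * 2**e for k, e in enumerate(exps))
```

For example, `reversing_rep(3)` yields `[0, 1, 2]`, since $3 = 1 - 2 + 4$.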

\ansno 28. A proof like that of exercise 27 may be given. Note
that $a + bi$ is a multiple of $1 + i$ by a complex integer
if and only if $a + b$ is even.

\ansno 29. It suffices to prove that any collection $\{T↓0, T↓1,
T↓2, \ldotss\}$ satisfying Property B may be obtained by collapsing
some collection $\{S↓0, S↓1, S↓2, \ldotss\}$, where $S↓0 = \{0, 1,
\ldotss , b - 1\}$ and all elements of $S↓1$, $S↓2$, $\ldots$ are
multiples of $b$.

To prove the latter statement, we may assume that
$1 \in T↓0$ and that there is a least element $b > 1$ such
that $b \notin T↓0$. We will prove, by induction on $n$,
that if $nb \not\in T↓0$, then $nb + 1$, $nb + 2$, $\ldotss$,
$nb + b - 1$ are not in any of the $T↓j$'s; but if $nb \in
T↓0$, then so are $nb + 1$, $\ldotss$, $nb + b - 1$. The result then
follows with $S↓1 = \leftset nb \relv nb \in T↓0\rightset$, $S↓2 = T↓1$, $S↓3 =
T↓2$, etc.

If $nb \notin T↓0$, then $nb = t↓0 + t↓1+
\cdotss$, where $t↓1$, $t↓2$, $\ldots$ are multiples of $b$; hence
$t↓0 < nb$ is a multiple of $b$. By induction, $(t↓0 + k) +
t↓1 + t↓2 + \cdots$ is the representation of $nb
+ k$, for $0 < k < b$; hence $nb + k \notin T↓j$ for any
$j$.

If $nb \in T↓0$ and $0 < k < b$, let the representation
of $nb + k$ be $t↓0 + t↓1 + \cdotss$. We cannot have $t↓j =
nb + k$ for $j ≥ 1$, lest $nb + b$ have two representations
$(b - k) + \cdots + (nb + k) + \cdots = (nb) + \cdots + b +
\cdotss$. By induction, $t↓0 \mod b = k$; and the representation
$nb = (t↓0 - k) + t↓1 + \cdots$ implies that $t↓0
= nb + k$.

[Reference: {\sl Nieuw Archief voor Wiskunde} (3)
{\bf 4} (1956), 15--17. A finite analog of this result was derived
by P. A. MacMahon, {\sl Combinatory Analysis \bf 1} (1915),
217--223.]

\ansno 30. (a)\9 Let $A↓j$ be the set of numbers $n$ whose representation
does not involve $b↓j$; then by the uniqueness property, $n
\in A↓j$ iff $n + b↓j \notin A↓j$. Consequently $n \in
A↓j$ iff $n + 2b↓j \in A↓j$. It follows that, for $j ≠ k$, $n \in
A↓j ∩ A↓k$ iff $n + 2b↓jb↓k \in A↓j ∩ A↓k$. Let $m$ be the number
of integers $n \in A↓j ∩ A↓k$ such that $0 ≤ n < 2b↓jb↓k$. Then
this interval contains exactly $m$ integers that are in
$A↓j$ but not $A↓k$, exactly $m$ in $A↓k$ but not $A↓j$, and
exactly $m$ in neither $A↓j$ nor $A↓k$; hence $4m = 2b↓jb↓k$.
Therefore $b↓j$ and $b↓k$ cannot both be odd. But at least one
$b↓j$ is odd, of course, since odd numbers can be represented.\xskip
(b) According to (a) we can renumber the $b$'s so that $b↓0$ is odd and
$b↓1$, $b↓2$, $\ldots$ are even; then ${1\over2}b↓1$, ${1\over2}b↓2$,
$\ldots$ must also be a binary basis, and the process can be iterated.

(c)\9 If it is a binary basis, we must have positive and negative $d↓k$'s for
arbitrarily large $k$, in order to represent $\pm2↑n$ when $n$ is large.
Conversely, the following algorithm may be used:

\algstep S1. [Initialize.] Set $k←0$.

\algstep S2. [Done?] If $n=0$, terminate.

\algstep S3. [Choose.] If $n$ is odd, include $2↑kd↓k$ in the representation,
and replace $n$ by $(n-d↓k)/2$. Otherwise set $n←n/2$.

\algstep S4. [Advance $k$.] Increase $k$ by 1 and return to S2.\quad\blackslug

\yskip At each step the choice is forced; furthermore
step S3 decreases $|n|$ unless $n=-d↓k$, hence the algorithm must terminate.

(d)\9 Two iterations of steps S2--S4 in the preceding algorithm will transform
$4m→m$, $4m+1→m+5$, $4m+2→m+7$, $4m+3→m-1$. Arguing as in exercise 19, we
need only show that the algorithm terminates for $-2≤n≤8$; all other values of
$n$ are moved toward this interval. In this
range $3 → -1 → -2 → 6 → 8 → 2 → 7 → 0$ and $4 → 1 → 5 → 6$.
Thus $1 = 7 \cdot 2↑0 - 13 \cdot 2↑1 + 7 \cdot 2↑2 - 13 \cdot
2↑3 - 13 \cdot 2↑5 - 13 \cdot 2↑9 + 7 \cdot 2↑{10}$.
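Steps S1--S4 are easily programmed. The following Python transcription is an illustrative addition (not part of the original answer), using $d↓k = 7$ for even $k$ and $d↓k = -13$ for odd $k$, the case treated above.

```python
def binary_basis_rep(n, d):
    """Steps S1-S4: the forced choices that write n as a sum of
    distinct terms 2^k * d(k)."""
    terms, k = [], 0                  # S1: initialize
    while n != 0:                     # S2: done?
        if n % 2 != 0:                # S3: n odd forces 2^k d_k into the sum
            terms.append((k, d(k)))
            n = (n - d(k)) // 2       # exact, since d(k) is odd
        else:
            n //= 2
        k += 1                        # S4: advance k
    return terms

d = lambda k: 7 if k % 2 == 0 else -13
```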

{\sl Note:} The choice $d↓0$, $d↓1$, $d↓2$, $\ldots
= 5$, $-3$, 3, 5, $-3$, 3, $\ldots$ also yields a binary basis. For
further details see {\sl Math.\ Comp.\ \bf 18} (1964), 537--546;
A. D. Sands, {\sl Acta Mathematica}, Acad.\ Sci.\ Hung., {\bf 8}
(1957), 65--86.

\ansno 31. (See also the related exercises 3.2.2--11, 4.3.2--13,
4.6.2--22.)

(a)\9 By multiplying numerator and denominator by suitable powers
of 2, we may assume that $u = (\ldotsm u↓2u↓1u↓0)↓2$ and $v =
(\ldotsm v↓2v↓1v↓0)↓2$ are 2-adic integers,
where $v↓0 = 1$. The following computational
method now determines $w$, using the notation $u↑{(n)}$ to stand
for the integer $(u↓{n-1} \ldotsm u↓0)↓2 = u \mod 2↑n$ when
$n > 0$:

Let $w↓0 = u↓0$. For $n = 1$, 2, $\ldotss$, assume
that we have found an integer $w↑{(n)} = (w↓{n-1} \ldotsm w↓0)↓2$
such that $u↑{(n)} ≡ v↑{(n)}w↑{(n)} \modulo{2↑n}$.
Then we have $u↑{(n+1)} ≡ v↑{(n+1)}w↑{(n)}
\modulo {2↑n}$, hence $w↓n = 0$ or 1 according as
$(u↑{(n+1)} - v↑{(n+1)}w↑{(n)})\mod 2↑{n+1}$ is 0 or $2↑n$.
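The bit-at-a-time determination of $w$ can be sketched as follows (an illustrative Python rendering, not part of the original answer); ordinary Python integers, possibly negative, serve as the 2-adic inputs.

```python
def two_adic_divide(u, v, n):
    """First n bits (least significant first) of the 2-adic quotient
    w = u/v, where v is odd: bit w_k is chosen so that the invariant
    u ≡ v*w (mod 2^(k+1)) is maintained."""
    assert v % 2 == 1, "v must be a 2-adic unit (odd)"
    w, bits = 0, []
    for k in range(n):
        # u - v*w is divisible by 2^k by the invariant, so the
        # shift is exact; bit k of the difference decides w_k.
        bit = ((u - v * w) >> k) & 1
        bits.append(bit)
        w += bit << k
    return bits
```

The result agrees with part (b): for instance $-1/7 = (\ldots 001001001)↓2$, so the first six bits of $-1/7$ come out as 1, 0, 0, 1, 0, 0.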

(b)\9 Find the smallest integer $k$ such that $2↑k
≡ 1 \modulo {2n + 1}$. Then we have $1/(2n + 1) = m/(2↑k - 1)$ for
some integer $m$, $1 ≤ m < 2↑{k-1}$. Let $α$ be the $k$-bit binary
representation of $m$; then $(0.ααα\ldotsm)↓2$ times $2n + 1$
is $(0.111\ldotsm)↓2 = 1$ in the binary system, and $(\ldotsm ααα)↓2$
times $2n + 1$ is $(\ldotsm 111)↓2 = -1$ in the 2-adic system.

(c)\9 If $u$ is rational, say $u = m/2↑en$ where
$n$ is odd and positive, the 2-adic representation of $u$ is
periodic, because the set of numbers with periodic expansions
includes $-1/n$ and is closed under the operations of negation,
division by 2, and addition. Conversely, if $u↓{N+λ}=u↓N$ for all
$N≥\mu$, the 2-adic number $(2↑λ-1)2↑{-\mu}u$ is an integer.

(d)\9 The square of any number of the form $(\ldotsm
u↓2u↓11)↓2$ has the form $(\ldotsm 001)↓2$, hence the condition
is necessary. To show the sufficiency, we can use the following procedure
to compute $v = \sqrt{n}$ when $n \mod 8 = 1$:

\algstep H1. [Initialize.] Set $m ← (n - 1)/8$, $k ← 2$, $v↓0 ←
1$, $v↓1 ← 0$, $v ← 1$.\xskip (During this algorithm we will have $v=(v↓{k-1}\ldotsm
v↓1v↓0)↓2$ and $v↑2=n-2↑{k+1}m$.)

\algstep H2. [Transform.] If $m$ is even, set $v↓k ← 0$,
$m ← m/2$. Otherwise set $v↓k ← 1$, $m ← (m - v - 2↑{k-1})/2$, $v
← v + 2↑k$.

\algstep H3. [Advance $k$.] Increase $k$ by 1 and return to H2.\quad\blackslug
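A direct Python transcription of steps H1--H3 (an illustrative addition, not part of the original answer); the parenthetical invariant $v↑2 = n - 2↑{k+1}m$ is what makes the choices in step H2 forced.

```python
def two_adic_sqrt(n, p):
    """First p bits of a 2-adic v with v^2 ≡ n (mod 2^(p+1)),
    for n ≡ 1 (mod 8).  Invariant: v^2 = n - 2^(k+1) * m."""
    assert n % 8 == 1
    m, k, v = (n - 1) // 8, 2, 1      # H1: v = (v1 v0)_2 = (01)_2
    while k < p:
        if m % 2 == 0:                # H2: transform
            m //= 2                   #   v_k = 0
        else:
            m = (m - v - 2**(k - 1)) // 2   # v_k = 1 (exact division)
            v += 2**k
        k += 1                        # H3: advance k
    return v
```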

\ansno 32. A generalization appears
in {\sl Math.\ Comp.\ \bf 29} (1975), 84--86.

%folio 733 galley 2b (C) Addison-Wesley 1978	*
\ansbegin{4.2.1}

\def\\{\raise 3.1pt\hbox{$\scriptscriptstyle\!\!\not\,\,$}} % \\h will be h bar
\ansno 1. $N = (62, +.60\ 22\ 52\ 00)$;
$\\h = (37, +.10\ 54\ 50\ 00)$. Note that $10\\h$ would be $(38, +.01\
05\ 45\ 00)$.

\ansno 2. $b↑{E-q}(1 - b↑{-p})$, $b↑{-q-p}$;\xskip $b↑{E-q}(1 - b↑{-p})$,
$b↑{-q-1}$.

\ansno 3. When $e$ does not have its smallest value,
the most significant ``one'' bit (which appears in all such
normalized numbers) need not appear in the computer word.

\ansno 4. $(51, +.10209877)$; $(50, +.12346000)$; $(53, +.99999999)$.
The third answer would be $(54, +.10000000)$ if the first operand
had been $(45, -.50000000)$.

\ansno 5. If $x ~ y$ and $m$ is an integer then $mb + x ~ mb
+ y$. Furthermore $x~y$ implies $x/b~y/b$, by considering all
possible cases. Another crucial property is that $x$ and $y$ will
round to the same integer, whenever $x ~ y$.

Now if $b↑{-p-2}F↓v ≠ f↓v$ we must have $(b↑{p+2}f↓v)\mod
b ≠ 0$; hence the transformation leaves $f↓v$ unchanged unless
$e↓u - e↓v ≥ 2$. Since $u$ was normalized, it is nonzero and
$|f↓u + f↓v| > b↑{-1} - b↑{-2} ≥ b↑{-2}$: the leading nonzero
digit of $f↓u + f↓v$ must be at most two places to the right
of the radix point, and the rounding operation will convert
$b↑{p+j}(f↓u + f↓v)$ to an integer, where $j ≤ 1$. The proof
will be complete if we can show that $b↑{p+j+1}(f↓u + f↓v)
~ b↑{p+j+1}(f↓u + b↑{-p-2}F↓v)$. By the previous paragraph,
we have $b↑{p+2}(f↓u + f↓v) ~ b↑{p+2}f↓u + F↓v = b↑{p+2}(f↓u
+ b↑{-p-2}F↓v)$, which implies the desired result for all $j
≤ 1$.

Note that, when $b > 2$ is even, such an integer
$F↓v$ always exists; but when $b = 2$ we require $p + 3$ bits
(let $2F↓v$ be an integer). When $b$ is odd, an integer $F↓v$
always exists except in the case of division, when a remainder
of ${1\over 2}b$ is possible.

\ansno 6. (Consider the case $e↓u = e↓v$, $f↓u = -f↓v$ in Program
A\null.)\xskip Register A retains its previous sign, as in \.{ADD}.

\ansno 7. Say that a number is normalized iff it is zero or
its fraction part satisfies ${1\over 6} < |f| < {1\over 2}$.
A $(p + 1)$-place accumulator suffices for addition and subtraction;
rounding (except during division) is equivalent to truncation.
A very pleasant system indeed! We might represent numbers with
excess-zero exponent, inserted between the first and subsequent
digits of the fraction, and complemented if the fraction is
negative, so that fixed-point order is preserved.

\ansno 8. (a)\9 $(06, +.12345679) \oplus (06, -.12345678)$,\xskip $(01,
+.10345678) \oplus (00, -.94000000)$;\xskip\penalty1000 (b) $(99, +.87654321) \oplus
\hbox{itself}$,\xskip $(99, +.99999999) \oplus (91, +.50000000)$.

\ansno 9. $a = (-50, +.10000000)$, $b = (-41, +.20000000)$, $c =
a$, $d = (-41, +.80000000)$, $y = (11, +.10000000)$.

\ansno 10. $(50, +.99999000) \oplus (55, +.99999000)$.
%folio 735 galley 3a (C) Addison-Wesley 1978	*
\ansno 11. $(50, +.10000001) \otimes (50, +.99999990)$.

\ansno 12. If $0 < |f↓u| < |f↓v|$, then $|f↓u| ≤ |f↓v| - b↑{-p}$;
hence $1/b < |f↓u/f↓v| ≤ 1 - b↑{-p}/|f↓v| < 1 - b↑{-p}$. If
$0 < |f↓v| ≤ |f↓u|$, we have $1/b ≤ |f↓u/f↓v|/b ≤ \biglp(1 - b↑{-p})/(1/b)\bigrp/b
= 1 - b↑{-p}$.

\ansno 13. See J. Michael Yohe, {\sl IEEE Transactions} {\bf C--22}
(1973), 577--586; cf.\ also exercise 4.2.2--24.

\mixans 14. {⊗FIX⊗STJ⊗9F⊗Float-to-fix subroutine:\cr
⊗⊗STA⊗TEMP\cr
⊗⊗LD1⊗TEMP(EXP)⊗$\rI1 ← e$.\cr
⊗⊗SLA⊗1⊗$\rA ← \pm\,f\,f\,f\,f\,0$.\cr
⊗⊗JAZ⊗9F⊗Is input zero?\cr
⊗⊗DEC1⊗1\cr
⊗⊗CMPA⊗=0=(1:1)⊗If leading byte is zero,\cr
⊗⊗JE⊗*-4⊗\quad shift left again.\cr
\\
⊗⊗ENN1⊗-Q-4,1\cr
⊗⊗J1N⊗FIXOVFLO⊗Is magnitude too large?\cr
\\
⊗⊗ENTX⊗0\cr
⊗⊗SRAX⊗0,1\cr
\\
⊗⊗CMPX⊗=1//2=\cr
⊗⊗JL⊗9F\cr
⊗⊗JG⊗*+2\cr
⊗⊗JAO⊗9F\cr
\\
⊗⊗STA⊗*+1(0:0)⊗Round, if necessary.\cr
⊗⊗INCA⊗1⊗Add $\pm 1$ (overflow is impossible).\cr
⊗9H⊗JMP⊗*⊗Exit from subroutine.\quad\blackslug\cr}

\mixans 15. {⊗FP⊗STJ⊗EXITF⊗Fractional part subroutine:\cr
⊗⊗JOV⊗OFLO⊗Ensure overflow is off.\cr
⊗⊗STA⊗TEMP⊗$\.{TEMP} ← u$.\cr
⊗⊗ENTX⊗0\cr
⊗⊗SLA⊗1⊗$\rA ← f↓u$.\cr
\\
⊗⊗LD2⊗TEMP(EXP)⊗$\rI2 ← e↓u$.\cr
⊗⊗DEC2⊗Q\cr
⊗⊗J2NP⊗*+3\cr
⊗⊗SLA⊗0,2⊗Remove integer part of $u$.\cr
⊗⊗ENT2⊗0\cr
⊗⊗JANN⊗1F\cr
\\
⊗⊗ENN2⊗0,2⊗Fraction is negative: find\cr
⊗⊗SRAX⊗0,2⊗\quad its complement.\cr
⊗⊗ENT2⊗0\cr
⊗⊗JAZ⊗*+2\cr
⊗⊗INCA⊗1\cr
⊗⊗ADD⊗WM1⊗Add word size minus one.\cr
\\
⊗1H⊗INC2⊗Q⊗Prepare to normalize the answer.\cr
⊗⊗JMP⊗NORM⊗Normalize, round, and exit.\cr
⊗8H⊗EQU⊗1(1:1)\cr
⊗WM1⊗CON⊗8B-1,8B-1(1:4)⊗Word size minus one\quad\blackslug\cr}

\ansno 16. If $|c| ≥ |d|$, then set $r ← d \odiv c$,
$s ← c \oplus (r \otimes d)$; $x ← \biglp a \oplus (b \otimes r)\bigrp
\odiv s$, $y ← \biglp b \ominus (a \otimes r)\bigrp \odiv
s$. Otherwise set $r ← c \odiv d$, $s ← d \oplus (r \otimes
c)$; $x ← \biglp (a \otimes r) \oplus b\bigrp \odiv s$, $y ← \biglp
(b \otimes r) \ominus a\bigrp \odiv s$. Then $x + iy$ is the
desired approximation to $(a + bi)/(c + di)$.\xskip [{\sl CACM \bf 5} (1962),
435. Other algorithms for complex arithmetic and function evaluation
are given by P. Wynn, {\sl BIT \bf 2} (1962), 232--255; see
also Paul Friedland, {\sl CACM \bf 10} (1967), 665.]
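In a language with rounded floating-point arithmetic the recipe reads as follows (an illustrative sketch, not part of the original answer; $\oplus$, $\ominus$, $\otimes$, $\odiv$ become the ordinary rounded operations):

```python
def complex_divide(a, b, c, d):
    """Approximate (a + bi)/(c + di) without forming c*c + d*d,
    so intermediate results stay near the operands in magnitude."""
    if abs(c) >= abs(d):
        r = d / c
        s = c + r * d
        x = (a + b * r) / s
        y = (b - a * r) / s
    else:
        r = c / d
        s = d + r * c
        x = (a * r + b) / s
        y = (b * r - a) / s
    return x, y
```

Scaling by $r$ is what avoids the premature overflow that the naive formula with denominator $c↑2 + d↑2$ would cause for large operands.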

\ansno 17. See Robert Morris, {\sl IEEE Transactions} {\bf C--20}
(1971), 1578--1579. Error analysis is more difficult with such
systems, so interval arithmetic is correspondingly more desirable.

\ansno 18. For positive numbers: shift fraction left until $f↓1
= 1$, then round, then if the fraction is zero (rounding overflow)
shift it right again. For negative numbers: shift fraction left
until $f↓1 = 0$, then round, then if the fraction is zero (rounding
underflow) shift it right again.

\ansno 19. $\biglp 43 + (1$ if $e↓v < e↓u) - (1$ if fraction overflow$)
- (10$ if result zero$) + (4$ if magnitude
is rounded up$) + (1$ if first rounding digit is $b/2) + (5$ if
rounding digits are $b/2\ 0 \ldotsm 0) + (7$ if rounding overflow$)
+ 7N + A(-1 + (11$ if $N > 0))\bigrp u$, where $N$ is the number
of left shifts during normalization, $A = 1$ if rX receives
nonzero digits (otherwise $A = 0$). The maximum time of $73u$
occurs for example when $$u = +50\ 01\ 00\ 00\ 00,\quad v = -46\ 49\ 99\
99\ 99,\quad b = 100.$$ [The average time, considering the
data in Section 4.2.4, will be about $45{1\over 2}u$.]

%folio 738 galley 3b (C) Addison-Wesley 1978	*
\ansbegin{4.2.2}

\ansno 1. $u \ominus v = u \oplus -v
= -v \oplus u = -(v \oplus -u) = -(v \ominus u)$.

\ansno 2. $u \oplus x ≥ u \oplus 0 = u$, by (8), (2), (6); hence
by (8) again, $(u \oplus x) \oplus v ≥ u \oplus v$. Similarly,
(8) and (6) together with (2) imply that $(u \oplus x) \oplus
(v \oplus y) ≥ (u \oplus x) \oplus v$.

\ansno 3. $u = 8.0000001$, $v = 1.2500008$, $w = 8.0000008$;
$(u \otimes v) \otimes w = 80.000064$, $u \otimes (v \otimes w)=
80.000057$.

\ansno 4. Yes; let $1/u \approx v = w$, where $v$
is large.

\ansno 5. Not always; in decimal arithmetic take $u = v = 9$.

\ansno 6. (a)\9 Yes.\xskip (b) Only for $b + p ≤ 4$ (try
$u = 1 - b↑{-p}$).\xskip [W. M. Kahan observes that the identity {\sl
does} hold whenever $b↑{-1} ≤ f↓u ≤ b↑{-1/2}$. It follows that
$1 \odiv \biglp 1 \odiv (1 \odiv u)\bigrp = 1 \odiv
u$ for all $u$.]

\def\expcirc#1{{\hbox to 9pt{\hfill
\hbox to 0pt{\hskip 0pt minus 100pt\raise 5.0833pt
\hbox{\:@\char'142}\hskip 0pt minus 100pt}\!
\hbox to 0pt{\hskip 0pt minus 100pt\:e#1\hskip 0pt minus 100pt}\hfill}}}
\ansno 7. If $u$ and $v$ are consecutive floating-binary numbers,
$u \oplus v = 2u$ or $2v$. When it is $2v$ we often have $u↑\expcirc2\oplus
v↑\expcirc2<2v↑\expcirc2$. For example,
$u = (.10 \ldotsm 001)↓2$, $v = (.10 \ldotsm 010)↓2$, $u \oplus v = 2v$,
and $u↑\expcirc2+v↑\expcirc2=(.10 \ldotsm 011)↓2$.

\ansno 8. (a)\9 $~$, $\approx$;\xskip (b) $~$, $\approx$;\xskip (c)
$~$, $\approx$;\xskip (d) $~$;\xskip (e) $~$.

\ansno 9. $|u - w| ≤ |u - v| + |v - w| ≤ ε↓1 \min(b↑{e↓u
-q}, b↑{e↓w-q}) + ε↓2 \min(b↑{e↓v-q}, b↑{e↓w
-q}) ≤ ε↓1b↑{e↓u-q}+ε↓2b↑{e↓w-q} ≤ (ε↓1
+ ε↓2)\max(b↑{e↓u-q}, b↑{e↓w-q})$. The result
cannot be strengthened in general, since for example we might
have $e↓u$ very small compared to both $e↓v$ and $e↓w$, and
this means that $u - w$ might be fairly large under the hypotheses.

\ansno 10. We have $(.a↓1 \ldotsm a↓{p-1}a↓p)↓b \otimes
(.9 \ldotsm 99)↓b = (.a↓1 \ldotsm a↓{p-1}(a↓p - 1))↓b$ if $a↓p
≥ 1$; here ``9'' stands for $b - 1$. Furthermore, $(.a↓1 \ldotsm
a↓{p-1}a↓p)↓b \otimes (1.0 \ldotsm 0)↓b = (.a↓1 \ldotsm a↓{p-1}0)↓b$, so
the multiplication is not monotone if $b>2$ and $a↓p≥2$. 
But when $b = 2$, this argument
can be extended to show that multiplication {\sl is} monotone; obviously
the ``certain computer'' had $b > 2$.

\ansno 11. Without loss of generality, let $x$ be an integer,
$|x| < b↑p$. If $e ≤ 0$ then $t = 0$. If $0 < e ≤ p$ then $x
- t$ has at most $p + 1$ digits, the least significant being
zero. If $e > p$ then $x - t = 0$.\xskip [The result holds also under
the weaker hypothesis $|t| < 2b↑e$.]

\def\dprime{↑{\prime\prime}}
\ansno 12. Assume that $e↓u = p$, $e↓v ≤ 0$, $u > 0$. Case 1, $u
> b↑{p-1}$. Case (1b), $w = u$, $|v| ≤ {1\over 2}$. Then $u↑\prime
= u$, $v↑\prime = 0$, $u\dprime = u$, $v\dprime = 0$. If $|v| = {1\over
2}$ and more general rounding is permitted we might also have
$u↑\prime = u \pm 1$, $v\dprime = \mp1$. Case (1c), $w = u - 1$, $v ≤ -{1\over 2}$,
$e↓v = 0$. Then $u↑\prime = u$ or $u - 1$, $v↑\prime = -1$, $u\dprime
= u$, $v\dprime = -1$ or 0. Case 2, $u = b↑{p-1}$. Case (2a),
$w = u + 1$, $v ≥ {1\over 2}$, $e↓v = 0$. Like (1a). Case (2b),
$w = u$, $|v| ≤ {1\over 2}$, $u↑\prime ≥ u$. Like (1b). Case (2c),
$w = u$, $|v| ≤ {1\over 2}$, $u↑\prime < u$. Then $u↑\prime = u
- j/b$ where $v = j/b + v↓1$ and $|v↓1| ≤ {1\over 2}b↑{-1}$ for
some positive integer $j ≤ {1\over 2}b$; we have $v↑\prime =
0$, $u\dprime = u$, $v\dprime = j/b$. Case (2d), $w < u$. Then $w
= u - j/b$ where $v = -j/b + v↓1$ and $|v↓1| ≤ {1\over 2}b↑{-1}$
for some positive integer $j ≤ b$; we have $(v↑\prime , u\dprime
) = (-j/b, u)$, and $(u↑\prime , v\dprime ) = (u, -j/b)$ or
$\biglp u - 1/b, (1 - j)/b\bigrp $, the latter case only when $v↓1
= {1\over 2}b↑{-1}$. In all cases $u \ominus u↑\prime = u -
u↑\prime$, $v \ominus v↑\prime = v - v↑\prime$, $u \ominus u\dprime
= u - u\dprime$, $v \ominus v\dprime = v - v\dprime $, round$(w
- u - v) = w - u - v$.

\ansno 13. Since round$(x) = 0$ iff $x = 0$, we want to find a large set of
integer pairs $(m, n)$ with the property that $m \odiv n$ is
an integer iff $m/n$ is. Assume that $|m|, |n| < b↑p$. If $m/n$
is an integer then $m \odiv n = m/n$ is also. Conversely if
$m/n$ is not an integer, but $m\odiv n$ is, we have $1/|n|
≤ |m\odiv n - m/n| < {1\over 2} |m/n| b↑{1-p}$, hence $|m|
> 2b↑{p-1}$. Our answer is therefore to require $|m| ≤ 2b↑{p-1}$
and $0 < |n| < b↑p$.\xskip (Slightly weaker hypotheses are also possible.)

\ansno 14. $|(u \otimes v) \otimes w - uvw| ≤ |(u \otimes v)
\otimes w - (u \otimes v)w| + |w|\,|u \otimes v - uv| ≤ \delta
↓{(u\otimes v)\otimes w} + b↑{e↓w-q-l↓w}\delta ↓{u\otimes v}
≤ (1 + b) \delta ↓{(u\otimes v)\otimes w}$. Now $|e↓{(u\otimes v)\otimes w}
- e↓{u\otimes(v\otimes w)}| ≤ 2$, so we may take $ε = {1\over 2}(1
+ b)b↑{2-p}$.

\ansno 15. $u ≤ v$ implies that $(u \oplus u) \odiv 2 ≤ (u
\oplus v) \odiv 2 ≤ (v \oplus v) \odiv 2$, so the condition
holds for all $u$ and $v$ iff it holds whenever $u = v$. For
base $b = 2$, the condition is therefore always satisfied (barring
overflow); but for $b > 2$ there are numbers $v ≠ w$ such that
$v \oplus v = w \oplus w$, hence the condition fails.\xskip [On the
other hand, the formula $u \oplus \biglp (v \ominus u) \odiv
2\bigrp$ does give a midpoint in the correct range. {\sl Proof:
}It suffices to show that $u + (v \ominus u) \odiv 2 ≤
v$, i.e., $(v \ominus u) \odiv 2 ≤ v - u$; and it is easy
to verify that round$\biglp{1\over 2}\hbox{round}(x)\bigrp ≤ x$
for all $x ≥ 0$.]

\ansno 16. (a)\9 Exponent changes occur at $\sum ↓{10} = 11.111111$,
$\sum ↓{91} = 101.11111$, $\sum ↓{901} = 1001.1102$, $\sum ↓{9001}
= 10001.020$, $\sum ↓{90009} = 100000.91$, $\sum ↓{900819} = 1000000.0$;
therefore $\sum ↓{1000000} = 1109099.1$.

\vskip 2pt
(b)\9 $\sum ↓{1≤k≤n} 1.2345679
= 1224782.1$, and (14) tries to take the square root of $-.0053187053$.
But (15) and (16) are exact in this case.\xskip $\biglp$If $x↓k = 1 + \lfloor
(k - 1)/2\rfloor 10↑{-7}$, (15) and (16) have errors of order
$n$.$\bigrp$

(c)\9 We need to show that $u \oplus \biglp(v \ominus
u) \odiv k\bigrp$ lies between $u$ and $v$; see exercise 15.

\mixans 17. {⊗FCMP⊗STJ⊗9F⊗Floating point comparison subroutine:\cr
⊗⊗JOV⊗OFLO⊗Ensure overflow is off.\cr
⊗⊗STA⊗TEMP\cr
⊗⊗LDAN⊗TEMP⊗$v ← -v$.\cr
\noalign{\vskip 3pt
\hbox{\quad (Copy here lines 07-20 of Program 4.2.1A.)}\vskip3pt}
⊗⊗LDX⊗FV(0:0)⊗Set rX to zero with sign of $f↓v$.\cr
\\⊗⊗DEC1⊗5\cr
⊗⊗J1N⊗*+2\cr
⊗⊗ENT1⊗0⊗Replace large difference in exponents\cr
⊗⊗SRAX⊗5,1⊗\qquad by a smaller one.\cr
\\⊗⊗ADD⊗FU⊗$\rA ← \null$difference of operands.\cr
⊗⊗JOV⊗7F⊗Fraction overflow: not $~$.\cr
⊗⊗CMPA⊗EPSILON(1:5)\cr
⊗⊗JG⊗8F⊗Jump if not $~$.\cr
⊗⊗JL⊗6F⊗Jump if $~$.\cr
\\⊗⊗JXZ⊗9F⊗Jump if $~$.\cr
⊗⊗JXP⊗1F⊗If $|\rA| = ε$, check sign of $\rA \times \rX$.\cr
\\⊗⊗JAP⊗9F⊗Jump if $~$. $(\rA ≠ 0)$\cr
⊗⊗JMP⊗8F\cr
\\⊗7H⊗ENTX⊗1\cr
⊗⊗SRC⊗1⊗Make rA nonzero with same sign.\cr
⊗⊗JMP⊗8F\cr
\\⊗1H⊗JAP⊗8F⊗Jump if not $~$. $(\rA ≠ 0)$\cr
\\⊗6H⊗ENTA⊗0\cr
⊗8H⊗CMPA⊗=0=⊗Set comparison indicator.\cr
⊗9H⊗JMP⊗*⊗Exit from subroutine.\quad\blackslug\cr}
%folio 742 galley 4a (C) Addison-Wesley 1978	*
\ansno 19. Let $\gamma ↓k = \delta ↓k = \eta ↓k = \sigma
↓k = 0$ for $k > n$. It suffices to find the coefficient of
$x↓1$, since the coefficient of $x↓k$ will be the same except
with all subscripts increased by $k - 1$. Let $(f↓k, g↓k)$ denote
the coefficient of $x↓1$ in $(s↓k - c↓k, c↓k)$ respectively.
Then $f↓1 = (1 + \eta ↓1)(1 - \gamma ↓1 - \gamma ↓1\delta ↓1
- \gamma ↓1\sigma ↓1 - \delta ↓1\sigma ↓1 - \gamma ↓1 \delta
↓1\sigma ↓1)$, $g↓1 = (1 + \delta ↓1)(1 + \eta ↓1)(\gamma ↓1 +
\sigma ↓1 + \gamma ↓1\sigma ↓1)$, and $f↓k = (1 - \gamma ↓k\sigma
↓k - \delta ↓k\sigma ↓k - \gamma ↓k\delta ↓k\sigma ↓k)f↓{k-1}
+ (\gamma ↓k - \eta ↓k + \gamma ↓k \delta ↓k + \gamma ↓k\eta
↓k + \gamma ↓k \delta ↓k\eta ↓k + \gamma ↓k\eta ↓k\sigma ↓k
+ \delta ↓k\eta ↓k\sigma ↓k + \gamma ↓k \delta ↓k\eta ↓k\sigma
↓k)g↓{k-1}$, $g↓k = \sigma ↓k(1 + \gamma ↓k)(1 + \delta ↓k)f↓{k-1}
- (1 + \delta ↓k)(\gamma ↓k + \gamma ↓k\eta ↓k + \eta ↓k\sigma
↓k + \gamma ↓k\eta ↓k\sigma ↓k)g↓{k-1}$, for $1 < k ≤ n$. Thus
$f↓n = 1 + \eta ↓1 - \gamma ↓1 + (4n$ terms of 2nd order$) +
($higher order terms$) = 1 + \eta ↓1 - \gamma ↓1 + O(nε↑2)$ is
sufficiently small.\xskip [The Kahan summation formula was first published
in {\sl CACM \bf 8} (1965), 40; cf.\ {\sl Proc.\ IFIP Congress}
(1971), {\bf 2}, 1232. For another approach to accurate summation,
see R. J. Hanson, {\sl CACM \bf 18} (1975), 57--58. See also G. Bohlender,
{\sl IEEE Trans.\ \bf C--26} (1977), 621--632, for algorithms that compute
round$(x↓1+\cdots+x↓n)$ and round$(x↓1\ldotsm x↓n)$ {\sl exactly}, given
$\{x↓1,\ldotss,x↓n\}$.]

\ansno 20. By the proof of Theorem C\null, (47) fails for $e↓w =
p$ only if $|v| + {1\over 2} ≥ |w - u| ≥ b↑{p-1} +
b↑{-1}$; hence $|f↓u| ≥ |f↓v| ≥ 1 - ({1\over 2}b - 1)b↑{-p}$.
This rather rare case, in which $|f↓w|$ before normalization
takes its maximum value 2, is necessary and sufficient for failure.

\ansno 21. (Solution by G. W. Veltkamp.)\xskip Let $c = 2↑{\lceil p/2\rceil}
+ 1$; we may assume that $p ≥ 2$, so $c$ is representable.
First compute $u↑\prime = u \otimes c$, $u↓1 = (u \ominus u↑\prime
) \oplus u↑\prime$, $u↓2 = u \ominus u↓1$; similarly, $v↑\prime = v \otimes
c$, $v↓1 = (v \ominus v↑\prime ) \oplus v↑\prime$, $v↓2 = v \ominus
v↓1$. Then set $w ← u \otimes v$, $w↑\prime ← \biglp ((u↓1 \otimes
v↓1\ominus w) \oplus (u↓1 \otimes v↓2)) \oplus (u↓2 \otimes v↓1)\bigrp
\oplus (u↓2 \otimes v↓2)$.

It suffices to prove this when $u, v > 0$ and
$e↓u = e↓v = p$, so that $u$ and $v$ are integers $\in [2↑{p-1},
2↑p)$. Then $u = u↓1 + u↓2$ where $2↑{p-1} ≤ u↓1 ≤ 2↑p$, $u↓1
\mod 2↑{\lceil p/2\rceil}= 0$, and $|u↓2| ≤ 2↑{\lceil p/2\rceil-1}$;
similarly $v = v↓1 + v↓2$. The operations during the calculation
of $w↑\prime$ are exact, because $w - u↓1v↓1$ is a multiple
of $2↑{p-1}$ with $|w - u↓1v↓1| ≤ |w - uv| + |u↓2v↓1 + u↓1v↓2
+ u↓2v↓2| ≤ 2↑{p-1} + 2↑{p+\lceil p/2\rceil}+ 2↑{p-1}$;
and $|w - u↓1v↓1 - u↓1v↓2| ≤ |w - uv| + |u↓2v| < 2↑{p-1} +
2↑{\lceil p/2\rceil-1+p}$, where $w - u↓1v↓1 - u↓1v↓2$ is a multiple
of $2↑{\lceil p/2\rceil}$.

\ansno 22. We may assume that $b↑{p-1} ≤ u, v < b↑p$.
If $uv ≤ b↑{2p-1}$ then $x↓1 = uv - r$ where $|r| ≤ {1\over
2}b↑{p-1}$, hence $x↓2 =\hbox{round}(u - r/v) = x↓0$ (since $|r/v|
≤ {1\over 2}b↑{p-1}/b↑{p-1} ≤ {1\over 2}$, and equality implies
$v = b↑{p-1}$ hence $r = 0$). If $uv > b↑{2p-1}$ then $x↓1 =
uv - r$ where $|r| ≤ {1\over 2}b↑p$, hence $x↓1/v ≤ u - r/v
< b↑p + {1\over 2}b$ and $x↓2 ≤ b↑p$. If $x↓2 = b↑p$ then $x↓3
= x↓1$ $\biglp$for otherwise $(b↑p - {1\over 2})v ≤ x↓1 ≤ b↑p(v - 1)\bigrp$.
If $x↓2 < b↑p$ and $x↓1 > b↑{2p-1}$ then let $x↓2 = x↓1/v +
q$ where $|q| ≤ {1\over 2}$; we have $x↓3 =\hbox{round}(x↓1+
qv) = x↓1$. Finally if $x↓2 < b↑p$ and $x↓1 = b↑{2p-1}$ and
$x↓3 < b↑{2p-1}$ then $x↓4 = x↓2$ by the first case above. This
situation arises e.g.\ for $b = 10$, $p = 2$, $u = 19$, $v = 55$, $x↓1
= 1000$, $x↓2 = 18$, $x↓3 = 990$.

\def\omod{\hbox to 25pt{\hfill$\vcenter{\hbox{\:@\char'140}}$\hfill}\hskip-25pt
\raise .3pt\hbox to 25pt{\hfill$\vcenter{\moveleft .2pt\hbox{\:emod}}$\hfill}}
\ansno 23. Let $\lfloor u\rfloor = n$, so that
$u\omod 1 = u \ominus n = u - n + r$ where $|r| ≤ {1\over 2}b↑{-p}$;
we wish to show that round$(n - r) = n$. The result is clear
if $|n| > 1$; and $r = 0$ when $n = 0$ or 1, so the only subtle
case is when $n = -1$, $r = -{1\over 2}b↑{-p}$. The identity fails
iff $b$ is a multiple of 4 and $-b↑{-1} < u < -b↑{-2}$ and $u
\mod 2b↑{-p} = {3\over 2}b↑{-p}$ $\biglp$e.g., $p = 3$, $b = 8$,
$u = -(.0124)↓8\bigrp$.

\def\upper#1{\mathop{\hbox to 8.889pt{\:u\char'156\hskip-8.889pt\hfill
$\vcenter{\hbox{$\scriptscriptstyle#1$}}$\hfill}}}
\def\lwr#1{\mathop{\hbox to 8.889pt{\:u\char'157\hskip-8.889pt\hfill
$\vcenter{\hbox{$\scriptscriptstyle#1$}}$\hfill}}}
\ansno 24. Let $u = [u↓{\,l}, u↓r]$, $v = [v↓{\,l}, v↓r]$. Then $u \oplus v =
[u↓{\,l}\lwr+v↓{\,l},u↓r\upper+v↓r]$,
where $x\upper+y=y\upper+x$, $x \upper+
+0 = x$ for all $x$, $x\upper+ - 0 = x$ for all
$x ≠ +0$, $x\upper+ +∞ = +∞$ for all $x≠-∞$, and
$x \upper+ -∞$ needn't be defined; $x \lwr+
y = -\biglp (-x) \upper+ (-y)\bigrp$.
If $x\oplus y$ would overflow in normal floating-point arithmetic because
$x+y$ is too large, then $x\upper+y$ is $+∞$ and
$x\lwr+y$ is the largest representable number.

For subtraction, let $u \ominus v = u \oplus (-v)$, where $-v = [-v↓r, -v↓{\,l}]$.

Multiplication is somewhat more complicated. The correct procedure is to
let $$u \otimes v = \hbox{\:a[}\min(u↓{\,l}\lwr\times v↓{\,l},u↓{\,l}\lwr\times v↓r,u↓r
\lwr\times v↓{\,l},u↓r\lwr\times v↓r),\;\max(u↓{\,l}\upper\times v↓{\,l},
u↓{\,l}\upper\times v↓r,u↓r\upper\times v↓{\,l}, u↓r\upper\times v↓r)\hbox{\:a]},$$where
$x\upper\times y=y\upper\times x$, $x\upper\times(-y)=-(x\lwr\times y)=
(-x)\upper\times y$; $x\upper\times +0=(+0$ for $x>0$, $-0$ for $x<0)$;
$x\upper\times -0=-(x\upper\times+0)$; $x\upper\times+∞=(+∞$ for
$x>+0$, $-∞$ for $x<-0)$.\xskip $\biglp$It
 is possible to determine the min and max simply
by looking at the signs of $u↓{\,l}$, $u↓r$, $v↓{\,l}$, and $v↓r$, thereby computing
only two of the eight products, except when $u↓{\,l}<0<u↓r$ and $v↓{\,l}<0<v↓r$; in the
latter case we compute four products, and the answer is $[\min(u↓{\,l}\lwr\times v↓r,
u↓r\lwr\times
v↓{\,l}),\max(u↓{\,l}\upper\times v↓{\,l}, u↓r\upper\times v↓r)]$.$\bigrp$
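
As an illustration, here is a minimal sketch of the multiplication rule in Python (the function name is ours; the signed zeros and infinities are ignored, and endpoints may be exact rationals):

```python
def interval_mul(u, v):
    """Interval product by the sign-based case analysis above (signed zeros and
    infinities ignored; endpoints may be ints, Fractions, or floats)."""
    ul, ur = u
    vl, vr = v
    if ul < 0 < ur and vl < 0 < vr:
        # Both intervals straddle zero: these four products suffice.
        return (min(ul * vr, ur * vl), max(ul * vl, ur * vr))
    # Otherwise the endpoint signs pick out two products; for brevity
    # we simply take the min and max of all four.
    prods = (ul * vl, ul * vr, ur * vl, ur * vr)
    return (min(prods), max(prods))
```

In the straddling case the two omitted products $u↓{\,l}v↓{\,l}$ and $u↓rv↓r$ are both positive and the two kept ones negative, so the shortcut agrees with the full min and max.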

Finally, $u \odiv v$ is undefined if $v↓{\,l} < 0
< v↓r$; otherwise we use the formulas for multiplication with
$v↓{\,l}$ and $v↓r$ replaced respectively by $v↓r↑{-1}$ and $v↓{\,l}↑{-1}$,
 where $x \upper\times y↑{-1}=x\upper/y$,
$x\lwr\times y↑{-1}=x\lwr/y$, $(\pm0)↑{-1}=\pm∞$, $(\pm∞)↑{-1}=\pm0$.

$\biglp$Cf.\
 E. R. Hansen, {\sl Math.\ Comp.\ \bf22} (1968), 374--384. An alternative
scheme, in which division by 0 gives no error messages and intervals may be
neighborhoods of $∞$, has been proposed by W. M. Kahan. In Kahan's scheme, for
example, the reciprocal of $[-1,+1]$ is $[+1,-1]$, and an attempt to multiply an
interval containing 0 by an interval containing $∞$ yields $[-∞,+∞]$, the
set of all numbers. See {\sl Numerical Analysis}, Univ.\ Michigan Engineering
Summer Conf.\ Notes No.\ 6818 (1968).$\bigrp$

\ansno 25. Cancellation reveals {\sl previous} errors in the
computation of $u$ and $v$. For example, if $ε$ is small, we
often get poor accuracy when computing $f(x + ε) \ominus f(x)$,
because the rounded calculation of $f(x + ε)$ destroys much
of the information about $ε$. It is desirable to rewrite
such formulas as $ε \otimes g(x, ε)$, where $g(x, ε) = \biglp
f(x + ε) - f(x)\bigrp/ε$ is first computed symbolically.
Thus, if $f(x) = x↑2$ then $g(x, ε) = 2x + ε$; if $f(x) = 
\sqrt{x}$ then $g(x, ε) = 1/(\sqrt{x
+ ε} + \sqrt{x}\,)$.
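
A minimal sketch of this effect in Python (the values and names are ours): for $f(x) = x↑2$ the directly computed difference loses the information carried by $ε$, while $ε \cdot g(x, ε)$ retains it.

```python
from fractions import Fraction

def naive(x, eps):
    return (x + eps)**2 - x**2       # cancellation destroys information about eps

def rewritten(x, eps):
    return eps * (2*x + eps)         # eps * g(x, eps), computed without cancellation

x, eps = 1e8, 1e-5
# Exact rational value of eps*(2x + eps) for the given float inputs:
exact = Fraction(eps) * (2 * Fraction(x) + Fraction(eps))
err_naive = abs(Fraction(naive(x, eps)) - exact)
err_rewritten = abs(Fraction(rewritten(x, eps)) - exact)
```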

\ansno 26. See {\sl Math.\ Comp.\ \bf32} (1978), 227--232.
%folio 745 galley 4b (C) Addison-Wesley 1978	*
\ansbegin{4.2.3}

\ansno 1. First, $(w↓m, w↓{\,l}) = (.573,
.248)$; then $w↓mv↓{\,l}/v↓m = .290$; so the answer is $(.572, .958)$.
This in fact is the correct result to six decimals.

\ansno 2. The answer is not affected, since the normalization
routine truncates to eight places and can never look at this
particular byte position.\xskip (Scaling to the left occurs at most
once during normalization, since the inputs are normalized.)

\ansno 3. Overflow obviously cannot occur at line 09, since
we are adding two-byte quantities, or at line 22, since we are
adding four-byte quantities. In line 30 we are computing the
sum of three four-byte quantities, so this cannot overflow.
Finally, in line 32, overflow is impossible because the product
$f↓uf↓v$ must be less than unity.

\ansno 4. Insert ``\.{JOV OFLO; ENT1 0}'' between lines 03 and 04.
Replace lines 21--22 by ``\.{ADD TEMP(ABS);} \.{JNOV *+2;} \.{INC1 1}'',
and change lines 28--31 to ``\.{SLAX 5;} \.{ADD TEMP;} \.{JNOV *+2;} \.{INC1
1;} \.{ENTX 0,1;} \.{SRC 5}''. This adds five lines of code and only
1, 2, or 3 units of execution time.

\ansno 5. Insert ``\.{JOV OFLO}'' after line 06. Change lines 22,
31, 39 respectively to ``\.{SRAX 0,1}'', ``\.{SLAX 5}'', ``\.{ADD ACC}''.
Between lines
40 and 41, insert ``\.{DEC2 1;} \.{JNOV DNORM;} \.{INC2 1;} \.{INCX 1;} \.{SRC
1}''.\xskip(It's tempting to remove the ``\.{DEC2 1}'' in favor of ``\.{STZ EXPO}'',
but then ``\.{INC2 1}'' might overflow rI2!)\xskip This adds six lines of
code; the running time {\sl decreases} by $3u$, unless there is fraction
overflow, when it increases by $7u$.

\mixans 6. {⊗DOUBLE⊗STJ⊗EXITDF⊗Convert to double precision:\cr
⊗⊗ENTX⊗0⊗Clear rX.\cr
⊗⊗STA⊗TEMP\cr
\\⊗⊗LD2⊗TEMP(EXP)⊗$\rI2 ← e$.\cr
⊗⊗INC2⊗QQ-Q⊗Correct for difference in excess.\cr
\\⊗⊗STZ⊗EXPO⊗$\.{EXPO} ← 0$.\cr
⊗⊗SLAX⊗1⊗Remove exponent.\cr
⊗⊗JMP⊗DNORM⊗Normalize and exit.\cr
\noalign{\yskip}
⊗SINGLE⊗STJ⊗EXITF⊗Convert to single precision:\cr
⊗⊗JOV⊗OFLO⊗Ensure overflow is off.\cr
\\⊗⊗STA⊗TEMP\cr
⊗⊗LD2⊗TEMP(EXPD)⊗$\rI2 ← e$.\cr
⊗⊗DEC2⊗QQ-Q⊗Correct for difference in excess.\cr
\\⊗⊗SLAX⊗2⊗Remove exponent.\cr
⊗⊗JMP⊗NORM⊗Normalize, round, and exit.\quad\blackslug\cr}

\ansno 7. All three routines give zero as the answer if
and only if the exact result would be zero, so we need not worry
about zero denominators in the expressions for relative error.
The worst case of the addition routine is pretty bad: Visualized
in decimal notation, if the inputs are 1.0000000 and .99999999,
the answer is $b↑{-7}$ instead of $b↑{-8}$; thus the maximum
relative error $\delta ↓1$ is $b - 1$, where $b$ is the byte
size.

For multiplication and division, we may assume that the
exponents of both operands are \.{QQ}, and that both operands are
positive. The maximum error in multiplication is readily bounded
by considering Fig.\ 4: When $uv ≥ 1/b$, we have $0 ≤ uv - u \otimes
v < 3b↑{-9} + (b - 1)b↑{-9}$, so the relative error is bounded
by $(b + 2)b↑{-8}$. When $1/b↑2 ≤ uv < 1/b$, we have $0 ≤ uv
- u \otimes v < 3b↑{-9}$, so the relative error in this case
is bounded by $3b↑{-9}/uv ≤ 3b↑{-7}$. We take $\delta ↓2$ to
be the larger of the two estimates, namely $3b↑{-7}$.

Division requires a more careful analysis of Program
D\null. The quantity actually computed by the subroutine is $α -
\delta - bε\biglp (α - \delta↑{\prime\prime} )(β - \delta ↑\prime )
- \delta↑{\prime\prime\prime} \bigrp - \delta ↓n$ where $α = (u↓m + εu↓{\,l})/bv↓m$,
$β = v↓{\,l}/bv↓m$, and the nonnegative truncation errors
$(\delta , \delta ↑\prime , \delta↑{\prime\prime},
\delta↑{\prime\prime\prime})$ are respectively less than
$(b↑{-10}, b↑{-5}, b↑{-5}, b↑{-6})$; finally
$\delta ↓n$ (the truncation during
normalization) is nonnegative and less than either
$b↑{-9}$ or $b↑{-8}$, depending
on whether scaling occurs or not. The actual value of the quotient
is $α/(1 + bεβ) = α - bεαβ + b↑2αβ↑2\delta↑{\prime\prime\prime\prime}
$, where $\delta↑{\prime\prime\prime\prime}$ is the nonnegative error due
to truncation of the infinite series (2); here $\delta↑{\prime\prime\prime\prime}
< ε↑2=b↑{-10}$, since it is an alternating series. The relative error
is therefore the absolute value of $(bε\delta ↑\prime + bε\delta↑{\prime\prime}
β/α + bε\delta↑{\prime\prime\prime}/α) - (\delta /α + bε\delta ↑\prime
\delta↑{\prime\prime}/α + b↑2β↑2\delta↑{\prime\prime\prime\prime}+ \delta ↓n/α)$,
times $(1 + bεβ)$. The positive terms in this expression are
bounded by $b↑{-9} + b↑{-8} + b↑{-8}$, and the negative terms
are bounded by $b↑{-8} + b↑{-12} + b↑{-8}$ plus the contribution
by the normalizing phase, which can be about $b↑{-7}$ in magnitude.
It is therefore clear that the potentially greatest part of
the relative error comes during the normalization phase, and
that $\delta ↓3 = (b + 2)b↑{-8}$ is a safe upper bound for
the relative error.

\ansno 8. Addition: If $e↓u ≤ e↓v + 1$, the entire relative
error occurs during the normalization phase, so it is bounded
above by $b↑{-7}$. If $e↓u ≥ e↓v + 2$, and if the signs are
the same, again the entire error may be ascribed to normalization;
if the signs are opposite, the error due to shifting digits
out of the register is in the opposite direction from the subsequent
error introduced during normalization. Both of these errors
are bounded by $b↑{-7}$, hence $\delta ↓1 = b↑{-7}$.\xskip (This is
substantially better than the result in exercise 7.)

Multiplication: An analysis as in exercise 7 gives $\delta↓2=(b+2)b↑{-8}$.
%folio 748 galley 5 (C) Addison-Wesley 1978	*
\ansbegin{4.2.4}

\ansno 1. Since fraction overflow can
occur only when the operands have the same sign, this is the
probability that fraction overflow occurs divided by the probability
that the operands have the same sign, namely, $7\%/\biglp{1\over 2}(91\%)\bigrp
\approx 15\%$.

\ansno 3. $\log↓{10} 2.4 - \log↓{10} 2.3 \approx 1.84834\%$.

\ansno 4. The pages would be uniformly gray (same as ``random
point on a slide rule'').

\ansno 5. The probability that $10f↓U ≤ r$ is $(r - 1)/10 +
(r - 1)/100 + \cdots = (r - 1)/9$. So in this case the leading
digits are {\sl uniformly} distributed; e.g., leading digit
1 occurs with probability ${1\over 9}$.
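
A quick exact check of this computation (a Python sketch of ours):

```python
from fractions import Fraction

def cdf(r):
    """P(10*fU <= r) for fU uniform on [1/10, 1); equals (r - 1)/9."""
    return (Fraction(r, 10) - Fraction(1, 10)) / Fraction(9, 10)

# Probability of leading digit d is cdf(d+1) - cdf(d), for d = 1, ..., 9.
probs = [cdf(d + 1) - cdf(d) for d in range(1, 10)]
```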

\ansno 6. The probability that there are three leading zero
bits is $\log↓{16} 2 = {1\over 4}$; the probability that there
are two leading zero bits is $\log↓{16} 4 - \log↓{16} 2 = {1\over
4}$; and similarly for the other two cases. The ``average'' number
of leading zero bits is $1{1\over2}$, so the ``average'' number
of ``significant bits'' is $p + {1\over 2}$. The worst case,
$p - 1$ bits, occurs however with rather high probability. In
practice, it is usually necessary to base error estimates on
the worst case, since a chain of calculations is only as strong as its weakest link.
In the error analysis of Section 4.2.2, the
upper bound on relative rounding error for floating-hex is $2↑{1-p}$.
In the binary case we can have $p + 1$ significant bits at all
times (cf.\ exercise 4.2.1--3), with relative rounding errors
bounded by $2↑{-1-p}$. Extensive computational experience confirms
that floating-binary (even with $p$-bit precision instead of
$p + 1$) produces significantly more accurate results than $(p
+ 2)$-bit floating-hex.

Sweeney's tables show that hexadecimal arithmetic
can be done a little faster, since fewer cycles are needed when
scaling to the right or normalizing to the left. But this fact
is insignificant compared to the substantial advantages of $b
= 2$ over other radices (cf.\ also Theorem 4.2.2C and exercises
4.2.2--15, 21), especially since floating-binary can be made
as fast as floating-hex with only a tiny increase in total processor
cost.

\ansno 7. For example, suppose that $\sum ↓m \biglp F(10↑{km}
\cdot 5↑k) - F(10↑{km})\bigrp = \log 5↑k/\log 10↑k$ and also that $\sum
↓m \biglp F(10↑{km} \cdot 4↑k) - F(10↑{km})\bigrp = \log 4↑k/\log
10↑k$; then
$$\chop to 9pt{\sum ↓{m}\,\biglp F(10↑{km} \cdot 5↑k) - F(10↑{km} \cdot 4↑k)\bigrp
= \log↓{10} {5\over 4}}$$
for all $k$. But now let $ε$ be a small positive
number, and choose $\delta > 0$ so that $F(x) < ε$ for $0 < x
< \delta $, and choose $M > 0$ so that $F(x) > 1-ε$ for $x
> M$. We can take $k$ so large that $10↑{-k} \cdot 5↑k < \delta$
and $4↑k > M$; hence by the monotonicity of $F$,
$$\eqalign{\sum ↓{m}\, \biglp F(10↑{km} \cdot 5↑k) - F(10↑{km} \cdot 4↑k)\bigrp
⊗ ≤ \sum ↓{m≤0}\, \biglp F(10↑{km} \cdot 5↑k) - F(10↑{k(m-1)}
\cdot 5↑k)\bigrp \cr
⊗\qquad\null + \sum ↓{m≥0}\, \biglp F(10↑{k(m+1)}
\cdot 4↑k) - F(10↑{km} \cdot 4↑k)\bigrp \cr
⊗= F(10↑{-k}5↑k) + 1 - F(10↑k4↑k) < 2ε.\cr}$$

\ansno 8. When $s > r$, $P↓0(10↑ns)$ is 1 for small $n$,
and 0 when $\lfloor 10↑ns\rfloor > \lfloor 10↑nr\rfloor $. The
least $n$ for which this happens may be arbitrarily large, so
no uniform bound can be given for $N↓0(ε)$ independent of $s$.\xskip
(In general, calculus textbooks prove that such a uniform bound
would imply that the limit function $S↓0(s)$ would be continuous,
and it isn't.)

\ansno 9. Let $q↓1$, $q↓2$, $\ldots$ be such that $P↓0(n) = q↓1{n-1\choose0}
+ q↓2{n-1\choose1} + \cdots$ for all $n$. It follows that
$P↓m(n) = 1↑{-m}q↓1{n-1\choose0} + 2↑{-m}q↓2{n-1\choose1}
+\cdots$ for all $m$ and $n$.

\ansno 10. When $1 < r < 10$ the generating function $C(z)$
has simple poles at the points $1 + w↓n$, where $w↓n = 2πni/\ln 10$,
hence
$$\chop to 9pt{C(z) = {\log↓{10} r - 1\over 1 - z} + \sum ↓{n≠0}
{1 + w↓n\over w↓n} {e↑{-w↓n\ln r} - 1\over (\ln 10)(z
- 1 - w↓n)} + E(z)}$$
where $E(z)$ is analytic in the entire plane. Thus
if $\theta = \hbox{arctan}(2π/\ln 10)$,
$$\eqalign{c↓m ⊗ =\log↓{10} r - 1 - {2\over \ln 10} \sum ↓{n>0}
\real\bigglp{e↑{-w↓n\ln r} - 1\over w↓n(1 + w↓n)↑m}\biggrp+ e↓m\cr
⊗= \log↓{10} r - 1 + {\sin(m\theta + 2π\log↓{10}
r) -\sin(m\theta )\over π\biglp1 + 4π↑2/(\ln 10)↑2\bigrp ↑{m/2}}
+ O\bigglp{1\over \biglp 1 + 16π↑2/(\ln 10)↑2\bigrp ↑{m/2}}\biggrp.\cr}$$

\ansno 11. When $(\log↓b U) \mod 1$ is uniformly distributed
in $[\,0, 1)$, so is $(\log↓b 1/U)\mod 1 = (1 - \log↓b U)\mod 1$.

\ansno 12. We have $h(z) = \int ↑{z}↓{1/b} f(x)\,dx\,g(z/bx)/bx + \int
↑{1}↓{z} f(x)\,dx\,g(z/x)/x$, and the same holds for $l(z) = \int ↑{z}↓{1/b} f(x)
\,dx\,l(z/bx)/bx + \int ↑{1}↓{z} f(x)\,dx\,l(z/x)/x$, hence $$
{h(z) - l(z)\over l(z)} = \int ↑{z}↓{1/b} f(x)\,dx\,{g(z/bx)
- l(z/bx)\over l(z/bx)} + \int ↑{1}↓{z} f(x)\,dx\,{g(z/x)
- l(z/x)\over l(z/x)}.$$ Since $f(x) ≥ 0$, $\hbox{\:u\char'152}
\biglp h(z) - l(z)\bigrp/l(z)\hbox{\:u\char'152}
 ≤ \int ↑{z}↓{1/b} f(x)\,dx\,A(g) + \int ↑{1}↓{z} f(x)\,dx\,A(g)$
for all $z$, hence $A(h) ≤ A(g)$. By symmetry, $A(h) ≤ A(f)$.\xskip
[{\sl Bell System Tech.\ J.\ \bf 49} (1970), 1609--1625.]

\ansno 13. Let $X = (\log↓b U)\mod 1$ and $Y = (\log↓b V)\mod
1$, so that $X$ and $Y$ are independently and uniformly distributed
in $[\,0, 1)$. No left shift is needed if and only if $X + Y ≥ 1$,
and this occurs with probability ${1\over 2}$.

(Similarly, the probability is $1\over2$ that floating-point division by
Algorithm 4.2.1M needs no normalization shifts;
this needs only the weaker assumption
that both operands independently have the {\sl same} distribution.)
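
An exact check of our own that $X + Y ≥ 1$ holds with probability $1\over2$: counting midpoints of an $N \times N$ grid over the unit square gives exactly $(N+1)/2N$, which tends to $1\over2$ as $N → ∞$.

```python
from fractions import Fraction

N = 1000
# Midpoint (i+1/2)/N + (j+1/2)/N >= 1 is equivalent to i + j + 1 >= N.
count = sum(1 for i in range(N) for j in range(N) if i + j + 1 >= N)
prob = Fraction(count, N * N)
```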

\ansno 14. For convenience, the calculations are shown here
for $b = 10$. If $k = 0$, the probability of a carry is
$$\hbox to size{$\vcenter{
\hbox to 250pt{$\dispstyle\hfill\left(
1\over\ln 10\right)↑2\int ↓{\scriptstyle1≤x,y≤10\atop\scriptstyle x+y≥10} {dx\over
x}\,{dy\over y}.\hfill$}
\vskip 8pt
\hbox{(See Fig.\ A--7.) The value of the integral is}
\vskip 11pt
\hbox to 250pt{$\dispstyle\hfill\int↓0↑{10}{dy\over y}\int↓{10-y}↑{10}{dx\over x}
-2\int↓0↑1{dy\over y}\int↓{10-y}↑{10}{dx\over x},\hfill$}}\hfill
\vcenter{\vskip33mm\hbox to 32mm{\hfill\caption Fig.\ A--7.\hfill\hskip-10pt}}$}$$
and
$$\int ↑{t}↓{0}{dy\over y}\,\ln\left(1\over 1 - y/10\right)=
\int ↑{t}↓{0}\left({1\over 10} + {y\over 200}+ {y↑2\over
3000} + \cdotss\right)\,dy = {t\over 10} + {t↑2\over 400} +
{t↑3\over 9000} + \cdotss.$$
(The latter integral is essentially a ``dilogarithm.'')
Hence the probability of a carry when $k = 0$ is
$(1/\ln 10)↑2(π↑2/6 - 2 \sum ↓{n≥1} 1/n↑210↑n) = .27154$.\xskip [{\sl Note:}
When $b= 2$ and $k = 0$, fraction overflow {\sl
always} occurs, so this derivation proves that $\sum ↓{n≥1}
1/n↑22↑n = π↑2/12 - {1\over 2}(\ln 2)↑2$.]
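
The identity just derived is easy to confirm numerically; a Python sketch (ours):

```python
import math

# Partial sum of the series; the tail beyond n = 60 is below 1/(60^2 * 2^60).
lhs = sum(1 / (n * n * 2**n) for n in range(1, 61))
rhs = math.pi**2 / 12 - math.log(2)**2 / 2
```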

When $k > 0$, the probability is
$$\left(1\over\ln 10\right)↑2\int ↑{10↑{1-k}}↓{10↑{-k}}\≤\≤{dy\over
y}\int ↑{10}↓{10-y}{dx\over x} = \left(1\over\ln 10\right)↑2\bigglp\sum
↓{n≥1} {1\over n↑210↑{nk}} - \sum ↓{n≥1} {1\over n↑210↑{n(k+1)}}\biggrp.$$
Thus when $b = 10$, fraction overflow should occur
with probability $.272p↓0 + .017p↓1 + .002p↓2 + \cdotss$. When
$b = 2$ the corresponding figures are $p↓0 + .655p↓1 + .288p↓2
+ .137p↓3 + .067p↓4 + \cdotss$.
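
These coefficients can be recomputed from the displayed formula; the following Python sketch (function names ours; the $k = 0$ case uses the dilogarithm expression derived above) reproduces $.272$, $.017$, $.002$ for $b = 10$ and $.655$, $.288$, $.137$ for $b = 2$:

```python
import math

def dilog(x, terms=200):
    """Truncated series for sum_{n>=1} x^n / n^2."""
    return sum(x**n / (n * n) for n in range(1, terms))

def overflow_coeff(b, k):
    """Probability of fraction overflow given exponent difference k >= 1."""
    return (dilog(b**-k) - dilog(b**-(k + 1))) / math.log(b)**2

def overflow_coeff0(b):
    """The k = 0 case, via pi^2/6 - 2*dilog(1/b)."""
    return (math.pi**2 / 6 - 2 * dilog(1 / b)) / math.log(b)**2
```

Note that $\hbox{overflow\_coeff0}(2) = 1$, confirming that fraction overflow always occurs when $b = 2$ and $k = 0$.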

Now if we use the probabilities from Sweeney's
first table, dividing by .92 to eliminate zero operands and
assuming that the probabilities are independent of the operand
signs, we predict a probability of about 14 percent when $b
= 10$, instead of the 15 percent in exercise 1. For $b = 2$,
we predict about 48 percent, while the table yields 44 percent.
These results are certainly in agreement within the limits of
experimental error.

\ansno 15. When $k = 0$, the leading digit is 1 if and only
if there is a carry.\xskip (It is possible for fraction overflow and
subsequent rounding to yield a leading digit of 2, when $b ≥
4$, but we are ignoring rounding in this exercise.)\xskip The probability
of fraction overflow is $.272 < \log↓{10} 2$, as shown in the
previous exercise.

When $k > 0$, the leading digit is 1 with probability
$$\left(1\over\ln 10\right)↑2\bigglp\int ↑{10↑{1-k}}↓{10↑{-k}}\≤\≤\≤{dy\over
y}\int ↓{\scriptstyle1≤x<2-y\atop\scriptstyle\hbox{\:eor }10-y≤x<10}
\≤\≤{dx\over x}\biggrp < \left(1\over\ln 10\right)↑2
\bigglp\int ↑{10↑{1-k}}↓{10↑{-k}}\≤\≤\≤{dy\over y}\int ↓{1≤x≤2}\≤{dx\over
x}\biggrp = \log↓{10} 2.$$

\ansno 16. To prove the hint [which is due to Landau,
{\sl Prace matematyczno-fizyczne \bf 21} (1910), 103--113],
assume first that $\limsup a↓n = λ > 0$. Let $ε = λ/(λ + 4M)$
and choose $N$ so that $|a↓1 + \cdots + a↓n| < {1\over 10}ελn$
for all $n > N$. Let $n > N/(1 - ε)$, $n > 5/ε$ be such that $a↓n
> {1\over 2}λ$. Then, by induction, $a↓{n-k} ≥ a↓n - kM/(n - εn)
> {1\over 4}λ$ for $0 ≤ k < εn$, and $\sum ↓{n-εn<k≤n} a↓k ≥
{1\over 4}λ(εn - 1) > {1\over 5}λεn$. But $$\textstyle\left|\sum ↓{n-εn<k≤n}
a↓k\right| = \left|\sum ↓{1≤k≤n} a↓k - \sum ↓{1≤k≤n-εn} a↓k\right| ≤ {1\over
5}ελn$$ since $n - εn > N$. A similar contradiction applies if
$\liminf a↓n < 0$.

Assuming that $P↓{m+1}(n) → λ$ as $n → ∞$, let
$a↓k = P↓m(k) - λ$. If $m > 0$, the $a↓k$ satisfy the hypotheses
of the hint (cf.\ Eq.\ 4.2.2--15), 
since $0 ≤ P↓m(k)≤ 1$; hence $P↓m(n) → λ$.

\ansno 17. See {\sl Fibonacci Quarterly \bf 7} (1969), 474--475.\xskip
$\biglp$Persi Diaconis [Ph.D. thesis, Harvard University, 1974] has
shown among other things that the definition of probability
by repeated averaging is weaker than harmonic probability, in
the following precise sense: If $\lim↓{m→∞}\liminf↓{n→∞}
P↓m(n) = \lim↓{m→∞}\limsup↓{n→∞} P↓m(n) = λ$ then
the harmonic probability is $λ$. On the other hand the statement
``$10↑{k↑2}≤ n < 10↑{k↑2+k}$ for some integer
$k > 0$'' has logarithmic probability ${1\over 2}$, while repeated
averaging never settles down to give it any particular probability.$\bigrp$

\ansno 18. Let $p(a)=P(L↓a)$ and $p(a,b)=\sum↓{a≤k<b}p(k)$ for $1≤a<b$.
Since $L↓a=L↓{10a}∪L↓{10a+1}∪\cdots∪L↓{10a+9}$ for all $a$, we have
$p(a)=p\biglp10a,10(a+1)\bigrp$ by (i). Furthermore since $P(S)=P(2S)+P(2S+1)$
by (i), (ii), (iii), we have $p(a)=p\biglp2a,2(a+1)\bigrp$. It follows that
$p(a,b)=p(2↑m10↑na,2↑m10↑nb)$ for all $m,n≥0$.

If $1<b/a<b↑\prime/a↑\prime$ then $p(a,b)≤p(a↑\prime,b↑\prime)$. The reason is
that there exist integers $m$, $n$, $m↑\prime$, $n↑\prime$ such that
$2↑{m↑\prime}10↑{n↑\prime}a↑\prime≤2↑m10↑na<2↑m10↑nb≤2↑{m↑\prime}10↑{n↑\prime}
b↑\prime$ as a consequence of the fact that $\log2/\log10$ is irrational,
hence we can apply (v).\xskip
(Cf.\ exercise 3.5--22 with $k=1$ and $U↓n=n\log2/\log10$.)\xskip
In particular, $p(a)≥p(a+1)$, and it follows that $p(a,b)/p(a,b+1)≥(b-a)/(b+1-a)
$.\xskip (Cf.\ Eq.\ 4.2.2--15.)

Now we can prove that $p(a,b)=p(a↑\prime,b↑\prime)$ whenever $b/a=b↑\prime/a↑\prime
$; for $p(a,b)=p(10↑na,10↑nb)≤c↓np(10↑na,10↑nb-1)≤c↓np(a↑\prime,b↑\prime)$, for
arbitrarily large values of $n$, where
$c↓n=10↑n(b-a)\hbox{\:a/}\biglp10↑n(b-a)-1\bigrp=1+O(10↑{-n})$.

For any positive integer $n$ we have $p(a↑n,b↑n)=p(a↑n,ba↑{n-1})+p(ba↑{n-1},
b↑2a↑{n-2})+\cdots+p(b↑{n-1}a,b↑n)=np(a,b)$. If $10↑m≤a↑n≤10↑{m+1}$ and
$10↑{m↑\prime}≤b↑n≤10↑{m↑\prime+1}$ we have $p(10↑{m+1},10↑{m↑\prime})≤
p(a↑n,b↑n)≤p(10↑m,10↑{m↑\prime+1})$ by (v). But $p(1,10)=1$ by (iv), hence
$p(10↑m,10↑{m↑\prime})=m↑\prime-m$ for all $m↑\prime≥m$. We conclude that
$\lfloor\log↓{10}b↑n\rfloor-\lfloor\log↓{10}a↑n\rfloor-1≤np(a,b)≤\lfloor\log↓{10}
b↑n\rfloor-\lfloor\log↓{10}a↑n\rfloor+1$ for all $n$, and $p(a,b)=\log↓{10}(b/a)$.

[This exercise was inspired by D. I. A. Cohen, who proved a slightly weaker result
in {\sl J. Combinatorial Theory} (A) {\bf20} (1976), 367--370.]
%folio 754 galley 6 (C) Addison-Wesley 1978	*
\ansbegin{4.3.1}

\def\ansalgstep #1 #2. {\anskip\noindent\hbox to 40pt{\!
\hbox to 19pt{\hskip 0pt plus 1000pt minus 1000pt\bf#1 }\hskip 0pt plus 1000pt
\bf#2. }\hangindent 40pt}
\ansno 2. If the $i$th number to be
added is $u↓i = (u↓{i1}u↓{i2} \ldotsm u↓{in})↓b$, use Algorithm
A with step A2 changed to the following:

\ansalgstep{} A2\char'43. [Add digits.]
Set$$w↓j ← (u↓{1j} + \cdots + u↓{mj} + k)\mod b,\quad\hbox{and}\quad k ←
\lfloor (u↓{1j} + \cdots + u↓{mj} + k)/b\rfloor.$$

\noindent (The maximum
value of $k$ is $m - 1$, so step A3 would have to be altered
if $m > b$.)

\mixansfour 3. {⊗⊗ENT1⊗N⊗ 1\cr
⊗⊗JOV⊗OFLO⊗ 1⊗Ensure overflow is off.\cr
⊗⊗ENTA⊗0⊗ 1⊗$k ← 0$.\cr
\\⊗2H⊗ENT2⊗0⊗ N⊗$(\rI2 ≡ \hbox{next value of }k)$\cr
⊗⊗ENT3⊗M*N-N,1⊗N⊗$\biglp\.{LOC}(u↓{ij}) ≡ \.U + n(i - 1) + j\bigrp$ \cr
\\⊗3H⊗ADD⊗U,3⊗MN⊗$\rA ← \rA + u↓{ij}$.\cr
⊗⊗JNOV⊗*+2⊗MN\cr
⊗⊗INC2⊗1⊗K⊗Carry one.\cr
\\⊗⊗DEC3⊗N⊗MN⊗Repeat for $m ≥ i ≥ 1$.\cr
⊗⊗J3P⊗3B⊗MN⊗$\biglp\rI3 ≡ n(i - 1) + j\bigrp$ \cr
\\⊗⊗STA⊗W,1⊗N⊗$w↓j ←\rA$.\cr
⊗⊗ENTA⊗0,2⊗N⊗$k ← \rI2$.\cr
\\⊗⊗DEC1⊗1⊗ N\cr
⊗⊗J1P⊗2B⊗N⊗Repeat for $n ≥ j ≥ 1$.\cr
⊗⊗STA⊗W⊗ 1⊗Store final carry in $w↓0$.\quad\blackslug\cr}

\yyskip\noindent Running time, assuming that $K = {1\over 2}MN$, is $5.5MN
+ 7N + 4$ cycles.

\ansno 4. We may make the following assertion before
A1: ``$n ≥ 1$; and $0 ≤ u↓i, v↓i < b$ for $1 ≤ i ≤ n$.'' Before
A2, we assert: ``$1 ≤ j ≤ n$; $0 ≤ u↓i, v↓i < b$ for $1 ≤ i ≤
n$; $0 ≤ w↓i < b$ for $j < i ≤ n$; $0 ≤ k ≤ 1$; and
$$(u↓{j+1} \ldotsm u↓n)↓b + (v↓{j+1} \ldotsm v↓n)↓b = (kw↓{j+1}
\ldotsm w↓n)↓b.''$$
The latter statement means more precisely that
$$\chop to 12pt{\sum ↓{j<t≤n} u↓tb↑{n-t} + \sum ↓{j<t≤n} v↓tb↑{n-t} = kb↑{n-j}
+ \sum ↓{j<t≤n} w↓tb↑{n-t}.}$$
Before A3, we assert: ``$1 ≤ j ≤ n$;
$0 ≤ u↓i, v↓i < b$ for $1 ≤ i ≤ n$; $0 ≤ w↓i < b$ for $j ≤ i ≤
n$; $0 ≤ k ≤ 1$; and $(u↓j \ldotsm u↓n)↓b + (v↓j \ldotsm v↓n)↓b
= (kw↓j \ldotsm w↓n)↓b$.'' After A3, we assert that $0 ≤ w↓i
< b$ for $1 ≤ i ≤ n$; $0 ≤ w↓0 ≤ 1$; and $(u↓1 \ldotsm u↓n)↓b +
(v↓1 \ldotsm v↓n)↓b = (w↓0 \ldotsm w↓n)↓b$.

It is a simple matter to complete the proof by
verifying the necessary implications between the assertions
and by showing that the algorithm always terminates.

\ansalgstep 5. B1. Set $j ← 1$, $w↓0 ← 0$.

\ansalgstep{} B2. Set $t ← u↓j + v↓j$, $w↓j ← t \mod b$,
$i ← j$.

\ansalgstep{} B3. If $t ≥ b$, set $i ← i - 1$, $t ← w↓i
+ 1$, $w↓i ← t \mod b$, and repeat this step until $t < b$.

\ansalgstep{} B4. Increase $j$ by one, and if $j ≤ n$ go
back to B2.\quad\blackslug\lower 5pt\null
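
A direct transcription of Algorithm B in Python (ours), with the digit arrays indexed as in the text and $w↓0$ receiving the final carry:

```python
def algorithm_b(u, v, b):
    """Left-to-right radix-b addition, propagating each carry back as it occurs.
    u[1..n] and v[1..n] hold the digits (u[0], v[0] unused); returns w[0..n].
    Assumes b > 2, so w[0] itself can never overflow."""
    n = len(u) - 1
    w = [0] * (n + 1)
    for j in range(1, n + 1):        # B1/B4: j runs from 1 to n
        t = u[j] + v[j]              # B2
        w[j] = t % b
        i = j
        while t >= b:                # B3: carry one place to the left
            i -= 1
            t = w[i] + 1
            w[i] = t % b
    return w
```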

\ansalgstep 6. C1. Set $j ← 1$, $i ← 0$, $r ← 0$.

\ansalgstep{} C2. Set $t ← u↓j + v↓j$. If $t ≥ b$, set
$w↓i ← r + 1$, $w↓k ← 0$ for $i < k < j$; then set $i ← j$ and $r
← t \mod b$. Otherwise if $t < b - 1$, set $w↓i ← r$, $w↓k ← b
- 1$ for $i < k < j$; then set $i ← j$ and $r ← t$.

\ansalgstep{} C3. Increase $j$ by one. If $j ≤ n$, go
back to C2; otherwise set $w↓i ← r$, and $w↓k ← b - 1$ for $i
< k ≤ n$.\quad\blackslug\lower 5pt\null

\ansno 7. When $j = 3$, for example,
we have $k = 0$ with probability $(b + 1)/2b$; $k = 1$ with probability
$\biglp (b - 1)/2b\bigrp (1 - 1/b)$, namely the probability
that a carry occurs and that the preceding digit wasn't $b - 1$;
$k = 2$ with probability $\biglp (b - 1)/2b\bigrp (1/b)(1 - 1/b)$;
and $k = 3$ with probability $\biglp (b - 1)/2b\bigrp (1/b)(1/b)(1)$.
For fixed $k$ we may add the probabilities as $j$ varies from
1 to $n$; this gives the mean number of times the carry propagates
back $k$ places,
$$m↓k = {b - 1\over 2b↑k}\left( (n + 1 - k)\left(1
- {1\over b}\right) + {1\over b}\right).$$
As a check, we find that the average number of
carries is
$$m↓1 + 2m↓2 + \cdots + nm↓n = {1\over 2} \biggglp
n - {1\over b - 1}\bigglp 1 - \left(1\over b\right)↑n\,\biggrp\bigggrp,$$
in agreement with (6).
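
The agreement with (6) can be checked in exact arithmetic; a Python sketch (ours):

```python
from fractions import Fraction

def m(k, n, b):
    """The mean number m_k of k-place carry propagations, as derived above."""
    b = Fraction(b)
    return (b - 1) / (2 * b**k) * ((n + 1 - k) * (1 - 1 / b) + 1 / b)

def check(n, b):
    """Does sum_k k*m_k equal (1/2)(n - (1 - b^-n)/(b - 1))?"""
    lhs = sum(k * m(k, n, b) for k in range(1, n + 1))
    rhs = Fraction(1, 2) * (n - Fraction(1, b - 1) * (1 - Fraction(b)**-n))
    return lhs == rhs
```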

\def\mixanseight #1. #2{\annskip\vbox{
\halign{\hbox to 19pt{##}⊗\lft{\tt##}\quad⊗\lft{\tt##}\quad⊗\lft{\tt##}\quad
⊗$\ctr{##}$\hskip40pt⊗\lft{\tt##}\quad⊗\lft{\tt##}\quad⊗\lft{\tt##}\quad⊗$\ctr{##}
\quad$⊗##\hfill\cr
{\hfill\bf #1. }#2}}}
\mixanseight 8. {⊗⊗ENN1⊗N⊗1⊗2H⊗LDA⊗W+N+1,2⊗K\cr
⊗⊗JOV⊗OFLO⊗1⊗⊗INCA⊗1⊗K\cr
⊗⊗STZ⊗W⊗1⊗⊗STA⊗W+N+1,2⊗K\cr
⊗1H⊗LDA⊗U+N+1,1⊗N⊗⊗DEC2⊗1⊗K\cr
⊗⊗ADD⊗V+N+1,1⊗N⊗⊗JOV⊗2B⊗K\cr
⊗⊗STA⊗W+N+1,1⊗N⊗3H⊗INC2⊗1⊗N\cr
⊗⊗JNOV⊗3F⊗N⊗⊗J2N⊗1B⊗N\cr
⊗⊗ENT2⊗-1,1⊗L⊗⊗⊗⊗⊗\blackslug\cr}

\yyskip\noindent The running time depends on $L$, the number of positions
in which $u↓j + v↓j ≥ b$; and on $K$, the total number of carries.
It is not difficult to see that $K$ is the same quantity that appears in
Program A\null. The analysis
in the text shows that $L$ has the average value $N\biglp (b
- 1)/2b\bigrp $, and $K$ has the average value ${1\over 2}(N
- b↑{-1} - b↑{-2} - \cdots - b↑{-n})$. So if we ignore terms
of order $1/b$, the running time is $9N + L + 7K + 3 \approx
13N + 3$ cycles.

{\sl Note:} Since a carry occurs almost
half of the time, it would be more efficient to delay storing
the result by one step. This leads to a somewhat longer program
whose running time is approximately
$12N + 5$ cycles, based on the somewhat more detailed information
calculated in exercise 7.

\ansno 9. Replace ``$b$'' by ``$b↓{n-j}$'' everywhere in step
A2.

\ansno 10. If lines 06 and 07 were interchanged, we would almost
always have overflow, but register A might have a negative value
at line 08, so this would not work. If the instructions on lines
05 and 06 were interchanged, the sequence of overflows occurring
in the program would be slightly different in some cases, but
the program would still be right.

\ansno 11. (a)\9 Set $j
← 1$;\xskip (b) if $u↓j < v↓j$, terminate [$u < v$]; if $u↓j = v↓j$
and $j = n$, terminate [$u = v$]; if $u↓j = v↓j$ and $j < n$,
set $j ← j + 1$ and repeat (b); if $u↓j > v↓j$, terminate
[$u > v$]. This algorithm tends to be quite fast, since there
is usually low probability that $j$ will have to get very high
before we encounter a case with $u↓j ≠ v↓j$.

\ansno 12. Use Algorithm S with $u↓j = 0$ and $v↓j = w↓j$. Another
``borrow'' will occur at the end of the algorithm; this
time it should be ignored.
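
In effect this computes the radix-complement $b↑n - w$; a Python sketch (ours) of the borrow propagation:

```python
def complement(w, b):
    """Radix-b complement of the digit list w (most significant digit first),
    i.e., 0 - w with the final borrow ignored; represents b^n - w for w != 0."""
    out = []
    borrow = 0
    for d in reversed(w):
        t = 0 - d - borrow
        out.append(t % b)
        borrow = 1 if t < 0 else 0
    return list(reversed(out))       # the last borrow is dropped here
```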

\mixanseight 13. {⊗⊗ENTX⊗N⊗1⊗⊗ADD⊗CARRY⊗N\cr
⊗⊗JOV⊗OFLO⊗1⊗⊗JNOV⊗*+2⊗N\cr
⊗⊗ENTX⊗0⊗1⊗⊗INCX⊗1⊗K\cr
⊗2H⊗STX⊗CARRY⊗N⊗⊗STA⊗W,1⊗N\cr
⊗⊗LDA⊗U,1⊗N⊗⊗DEC1⊗1⊗N\cr
⊗⊗MUL⊗V⊗N⊗⊗J1P⊗2B⊗N\cr
⊗⊗SLC⊗5⊗N⊗⊗STX⊗W⊗1⊗\blackslug\cr}

\yyskip\noindent The running time is $23N + K + 5$ cycles, and $K$ is
roughly ${1\over 2}N$.

\ansno 14. The key inductive assertion is the one that should
be valid at the beginning of step M4; all others are readily
filled in from this one, which is as follows: ``$1 ≤ i ≤ n$;
$1 ≤ j ≤ m$; $0 ≤ u↓r < b$ for $1 ≤ r ≤ n$; $0 ≤ v↓r < b$ for $1
≤ r ≤ m$; $0 ≤ w↓r < b$ for $j < r ≤ m + n$; $0 ≤ k < b$; and
$$(w↓{j+1} \ldotsm w↓{m+n})↓b + kb↑{m+n-i-j} = u \times (v↓{j+1}
\ldotsm v↓m)↓b + (u↓{i+1} \ldotsm u↓n)↓b \times v↓jb↑{m-j}.''$$
(For the precise meaning of this notation, see the answer to exercise 4.)

\ansno 15. The error is nonnegative and less than $(n - 2)b↑{-n-1}$.
$\biglp$Similarly, if we ignore the products with $i + j > n
+ 3$, the error is bounded by $(n - 3)b↑{-n-2}$, etc.; but,
in some cases, we must compute all of the products if we want
to get the true rounded result.$\bigrp$

\ansalgstep 16. S1. Set $r ← 0$, $j ← 1$.

\ansalgstep{} S2. Set $w↓j ← \lfloor (rb + u↓j)/v\rfloor$,
$r ← (rb + u↓j)\mod v$.

\ansalgstep{} S3. Increase $j$ by 1, and return to S2
if $j ≤ n$.\quad\blackslug\lower 5pt\null
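
A transcription of steps S1--S3 in Python (ours), with the most significant digit first:

```python
def short_divide(u, v, b):
    """Divide the digit list u (radix b, most significant first) by the single
    digit v; return the quotient digits and the remainder."""
    w = []
    r = 0                            # S1
    for d in u:                      # S2-S3 for j = 1, ..., n
        w.append((r * b + d) // v)
        r = (r * b + d) % v
    return w, r
```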

\ansno 17. $u/v > u↓0b↑n/(v↓1 + 1)b↑{n-1} = b\biglp 1 - 1/(v↓1
+ 1)\bigrp > b\biglp 1 - 1/(b/2)\bigrp = b - 2$.

\ansno 18. $(u↓0b + u↓1)/(v↓1 + 1) ≤ u/(v↓1 + 1)b↑{n-1} < u/v$.

\ansno 19. $u - \A qv ≤ u - \A qv↓1b↑{n-1} - \A qv↓2b↑{n-2}
=u↓2b↑{n-2}+\cdots+u↓n+\A rb↑{n-1}-\A q
v↓2b↑{n-2} < b↑{n-2}(u↓2 + 1 + \A rb - \A q
v↓2) ≤ 0$. Since $u - \A qv < 0$, $q < \A q$.

\ansno 20. If $q ≤ \A q - 2$, then $u < (\A q - 1)v
< \A q\biglp v↓1b↑{n-1} + (v↓2 + 1)b↑{n-2}\bigrp - v < \A q
v↓1b↑{n-1} + \A qv↓2b↑{n-2} + b↑{n-1} - v ≤ \A qv↓1b↑{n-1}
+ (b\A r + u↓2)b↑{n-2} + b↑{n-1} - v = u↓0b↑n + u↓1b↑{n-1}
+ u↓2b↑{n-2} + b↑{n-1} - v ≤ u↓0b↑n + u↓1b↑{n-1} + u↓2b↑{n-2}
≤ u$. In other words, $u < u$, and this is a contradiction.

\def\bigsl{\hbox{\:a/}}
\ansno 21. (Solution by G. K. Goyal.)\xskip The inequality $v↓2\A q≤b\A r+u↓2$
implies that $\A q≤(u↓0b↑2+u↓1b+u↓2)/(v↓1b+v↓2)≤u\bigsl\biglp(v↓1b+v↓2)b↑{n-2}
\bigrp$. Now $u \mod v =
u-qv = v(1-α)$ where $0≤α=\A q-u/v≤u\biglp 1\bigsl\biglp(v↓1b+v↓2)b↑{n-2}
\bigrp-1/v\bigrp=u(v↓3b↑{n-3}+\cdotss)\bigsl\biglp(v↓1b+v↓2)b↑{n-2}v\bigrp
<u/(v↓1bv)≤\A q/(v↓1b)≤(b-1)/(v↓1b)$, and this is at most $2/b$ since
$v↓1≥{1\over2}(b-1)$.
%folio 758 galley 7a (C) Addison-Wesley 1978	*
\ansno 22. Let $u = 4100$, $v = 588$. We first try
$\A q = \lfloor {41\over 5}\rfloor = 8$, but $8 \cdot 8
> 10(41 - 40) + 0 = 10$. Then we set $\A q = 7$, and now
we find $7 \cdot 8 < 10(41 - 35) + 0 = 60$. But 7 times 588 equals
4116, so the true quotient is $q = 6$.\xskip (Incidentally, this example
shows that Theorem B cannot be improved under the given hypotheses,
when $b = 10$.)

\ansno 23. Obviously $v\lfloor b/(v + 1)\rfloor < (v + 1)\lfloor
b/(v + 1)\rfloor ≤ (v + 1)b/(v + 1) = b$; also if $v ≥ \lfloor
b/2\rfloor$ we obviously have $v\lfloor b/(v + 1)\rfloor ≥ v
≥ \lfloor b/2\rfloor $. Finally, assume that $1 ≤ v < \lfloor
b/2\rfloor $. Then $v\lfloor b/(v + 1)\rfloor > v\biglp b/(v
+ 1) - 1\bigrp ≥ b/2 - 1 ≥ \lfloor b/2\rfloor - 1$, be\-cause
$v\biglp b/(v + 1) - 1\bigrp - (b/2 - 1) = (b/2 - v - 1)(v - 1)/(v
+ 1) ≥ 0$. Finally since $v\lfloor b/(v + 1)\rfloor > \lfloor
b/2\rfloor - 1$, we must have $v\lfloor b/(v + 1)\rfloor ≥ \lfloor
b/2\rfloor $.
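
The bounds just proved are easy to verify exhaustively for small $b$; a Python sketch (ours):

```python
def bounds_hold(b):
    """Check floor(b/2) <= v * floor(b/(v+1)) < b for all 1 <= v < b."""
    return all(b // 2 <= v * (b // (v + 1)) < b for v in range(1, b))
```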

\ansno 24. The probability is only $\log↓b 2$, not $1\over 2$.\xskip
(For example, if $b = 2↑{35}$, the probability is approximately
${1\over 35}$; this is still high enough to warrant the special
test for $d = 1$ in steps D1 and D8.)

\def\mixansfive #1. #2{\def\\{\noalign{\penalty-200}}\annskip
\halign{\hbox to 19pt{##}⊗\rt{ \it##}\quad⊗\lft{\tt##}\quad⊗\lft{\tt##}\quad
⊗\lft{\tt##}\quad⊗$\ctr{##}$⊗\quad\lft{\rm##}\cr
{\hfill\bf #1. }#2}}
\mixansfive 25. {⊗002⊗⊗ENTA⊗1⊗1\cr
⊗003⊗⊗ADD⊗V+1⊗1\cr
⊗004⊗⊗STA⊗TEMP⊗1\cr
\\⊗005⊗⊗ENTA⊗1⊗1\cr
⊗006⊗⊗JOV⊗1F⊗1⊗Jump if $v↓1 = b - 1$.\cr
\\⊗007⊗⊗ENTX⊗0⊗1\cr
⊗008⊗⊗DIV⊗TEMP⊗1⊗Otherwise compute $b/(v↓1+ 1)$.\cr
⊗009⊗⊗JOV⊗DIVBYZERO⊗1⊗Jump if $v↓1 = 0$.\cr
\\⊗010⊗1H⊗STA⊗D⊗1\cr
⊗011⊗⊗DECA⊗1⊗1\cr
\\⊗012⊗⊗JANZ⊗*+3⊗1⊗Jump if $d ≠ 1$.\cr
⊗013⊗⊗STZ⊗U⊗1 - A\cr
⊗014⊗⊗JMP⊗D2⊗1 - A\cr
\\⊗015⊗⊗ENT1⊗N⊗A⊗Multiply $v$ by $d$.\cr
⊗016⊗⊗ENTX⊗0⊗A\cr
\\⊗017⊗2H⊗STX⊗CARRY⊗AN\cr
⊗018⊗⊗LDA⊗V,1⊗AN\cr
⊗019⊗⊗MUL⊗D⊗AN\cr
⊗$\cdots$⊗⊗⊗⊗⊗(as in exercise 13)\cr
⊗026⊗⊗J1P⊗2B⊗AN\cr
\\⊗027⊗⊗ENT1⊗M+N⊗A⊗(Now $\rX = 0$.)\cr
⊗028⊗2H⊗STX⊗CARRY⊗A(M + N)⊗Multiply $u$ by $d$.\cr
⊗029⊗⊗LDA⊗U,1⊗A(M + N)\cr
⊗$\cdots$⊗⊗⊗⊗⊗(as in exercise 13)\cr
⊗037⊗⊗J1P⊗2B⊗A(M + N)\cr
⊗038⊗⊗STX⊗U⊗A⊗\quad\blackslug\lower5pt\null\cr}

\ansno 26. (See the algorithm of exercise 16.)

{\yyskip\tabskip 22pt\mixfive{\!
101⊗D8⊗LDA⊗D⊗1⊗(Remainder will be left in loca-\cr
102⊗⊗DECA⊗1⊗1⊗\qquad tions \.{U+M+1} through \.{U+M+N})\cr
103⊗⊗JAZ⊗DONE⊗1⊗Terminate if $d = 1$.\cr
\\104⊗⊗ENN1⊗N⊗A⊗$\rI1 ≡ j - n - 1$; $j ← 1$.\cr
105⊗⊗ENTA⊗0⊗A⊗$r ← 0$.\cr
\\106⊗1H⊗LDX⊗U+M+N+1,1⊗AN⊗$\rAX ← rb+u↓{m+j}$.\cr
107⊗⊗DIV⊗D⊗AN\cr
108⊗⊗STA⊗U+M+N+1,1⊗AN\cr
109⊗⊗SLAX⊗5⊗AN⊗$r ← (rb + u↓{m+j})\mod d$.\cr
110⊗⊗INC1⊗1⊗AN⊗$j ← j + 1$.\cr
111⊗⊗J1N⊗1B⊗AN⊗Repeat for $1 ≤ j ≤ n$.\quad\blackslug\cr}}

\yyskip\noindent At this point, the division routine is complete; and
by the next exercise, register AX is zero.

\ansno 27. It is $du \mod dv = d(u \mod  v)$.

\ansno 28. For convenience, let us assume that $v$ has a decimal
point at the left, i.e., $v = (v↓0.v↓1v↓2\ldotsm)↓b$. After
step N1 we have ${1\over 2} ≤ v < 1 + 1/b$: for
$$\eqalignno{v\,\left\lfloor b + 1\over v↓1 + 1\right\rfloor⊗≤{v(b + 1)\over v↓1
+ 1} = {v(1 + 1/b)\over (1/b)(v↓1 + 1)} < 1 + {1\over b},\cr
\noalign{\hbox{and}}
v\,\left\lfloor b + 1\over v↓1 + 1\right\rfloor ⊗≥ {v(b+1 - v↓1)\over v↓1
+ 1} ≥ {1\over b} {v↓1(b + 1 - v↓1)\over v↓1 + 1}.\cr}$$
The latter quantity takes its smallest value when $v↓1
= 1$, since it is a concave function of $v↓1$ (so its minimum occurs at
an extreme) and the other extreme value is greater.

The formula in step N2 may be written
$$v ← \left\lfloor b(b + 1)\over v↓1 + 1\right\rfloor{v\over b},$$
so we see as above that $v$ will never become $≥1 + 1/b$.

The minimum value of $v$ after one iteration of
step N2 is $≥$
$$\eqalign{\left(b(b + 1) - v↓1\over v↓1 + 1\right){v\over
b} ≥ \left(b(b + 1) - v↓1\over v↓1 + 1\right){v↓1\over b↑2}⊗
= \left(b(b + 1) + 1 - t\over t\right)\left(t - 1\over
b↑2\right)\cr
\noalign{\vskip3pt}
⊗= 1 + {1\over b}+{2\over b↑2} - {1\over
b↑2}\left(t + {b(b + 1) + 1\over t}\right),\cr}$$
if $t = v↓1 + 1$. The minimum of this quantity occurs
for $t = b/2 + 1$; a lower bound is $1 - 3/(2b)$. Hence $v↓1 ≥
b - 2$ after one iteration of step N2. Finally, we have $(1
- 3/(2b))(1+1/b)↑2 > 1$ when $b ≥ 5$, so at most two more
iterations are needed. The assertion is easily verified when
$b < 5$.
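The behavior of step N2 can be simulated numerically. The sketch below is an assumption-laden paraphrase, not Algorithm N itself: it takes $v↓1$ to be the digit just after the radix point, supposes the iteration stops once $v ≥ 1$, and uses exact rational arithmetic; it then checks the conclusions above (that $v$ stays below $1 + 1/b$ and that few iterations are needed) for $b = 10$.

```python
from fractions import Fraction
from math import floor

def n2_iterations(v, b):
    """Apply v <- floor(b*(b+1)/(v1+1)) * v/b, where v1 is the digit just
    after the radix point, until v >= 1; return (final v, step count)."""
    count = 0
    while v < 1:
        v1 = floor(v * b)
        v = v * Fraction(b * (b + 1) // (v1 + 1), b)
        count += 1
    return v, count

# Starting anywhere in [1/2, 1), the iterate stays below 1 + 1/b and
# at most three steps are needed (b = 10 here):
for num in range(50, 100):
    v, count = n2_iterations(Fraction(num, 100), 10)
    assert 1 <= v < Fraction(11, 10) and count <= 3
```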

\ansno 29. True, since $(u↓j \ldotsm u↓{j+n})↓b < v$.

\ansno 30. In Algorithms A and S\null, such overlap is possible if
the algorithms are rewritten slightly; e.g., in Algorithm A\null,
we could rewrite step A2 thus: ``Set $t ← u↓j + v↓j + k$, $w↓j
← t \mod b$, $k ← \lfloor t/b\rfloor $.''

In Algorithm M\null, $v↓j$ may be in the same location
as $w↓j$. In Algorithm D\null, it is most convenient (as in Program
D\null, exercise 26) to let $r↓1 \ldotsm r↓n$ be the same as $u↓{m+1}
\ldotsm u↓{m+n}$; and we can also have $q↓0 \ldotsm q↓m$ the same
as $u↓0 \ldotsm u↓m$, provided that no alteration of $u↓j$ is
made in step D6.\xskip (Line 098 of Program D can safely be changed to ``\.{J1P
2B}'', since $u↓j$ isn't used in the subsequent calculation.)
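The rewritten step A2 can be sketched in Python (least-significant-digit-first lists; the names are illustrative). Because $t$ is read out completely before $w↓j$ is stored, the output array may coincide with either input:

```python
def add_in_place(u, v, b):
    """Add digit lists u and v (least significant digit first) into u,
    returning the final carry; u and v may be the same list."""
    k = 0
    for j in range(len(u)):
        t = u[j] + v[j] + k    # read both inputs before storing
        u[j] = t % b
        k = t // b
    return k
```

For instance, passing the same list twice doubles a number in place: the digits $(9\,9)↓{10}$ become $(9\,8)↓{10}$ with carry 1, i.e., $99 + 99 = 198$.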

\ansno 31. Consider the situation
of Fig.\ 6 with $u = (u↓ju↓{j+1} \ldotsm u↓{j+n})↓3$ as in Algorithm
D\null. If the leading nonzero digits of $u$ and $v$ have the same
sign, set $r ← u - v$, $q ← 1$; otherwise set $r ← u + v$, $q ←
-1$. Now if $|r| > |u|$, or if $|r| = |u|$ and the first nonzero
digit of $u↓{j+n+1} \ldotsm u↓{m+n}$ has the same sign as the
first nonzero digit of $r$, set $q ← 0$; otherwise set $u↓j
\ldotsm u↓{j+n}$ equal to the digits of $r$.
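This trial-digit rule can be sketched at the level of values rather than digits, since in balanced ternary the sign of a number equals the sign of its leading nonzero digit. In the sketch below (names and interface are illustrative assumptions), the digit window $u↓j \ldotsm u↓{j+n}$ and the divisor are passed as integers, and the sign of the first nonzero trailing digit is supplied separately.

```python
def bt_quotient_digit(u, v, tail_sign):
    """One quotient-digit trial: u is the window (u_j ... u_{j+n}) as an
    integer, v the divisor, tail_sign the sign (-1, 0, or +1) of the first
    nonzero digit among u_{j+n+1} ... u_{m+n}.  Returns (q, new window)."""
    if (u > 0) == (v > 0):     # leading nonzero digits of like sign
        r, q = u - v, 1
    else:
        r, q = u + v, -1
    if abs(r) > abs(u) or (abs(r) == abs(u) and tail_sign * r > 0):
        return 0, u            # trial digit rejected; window unchanged
    return q, r
```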

\ansno 36. Values to 1000 decimal and 1100 octal places have
been computed by R. P. Brent, Comp.\ Centre Tech.\ Rep.\ 47 (Canberra: Australian
Nat.\ Univ., 1975).

\ansno 37. Let $d=2↑e$ so that $b>dv↓1≥b/2$. Instead of normalizing $u$ and $v$ in
step D1, simply compute the two leading digits $v↓1↑\prime v↓2↑\prime$ of
$2↑e(v↓1v↓2v↓3)↓b$ by shifting left $e$ bits. In step D3, use $(v↓1↑\prime,
v↓2↑\prime)$ instead of $(v↓1,v↓2)$ and $(u↓j↑\prime,u↓{j+1}↑\prime,
u↓{j+2}↑\prime)$ instead of $(u↓j,u↓{j+1},u↓{j+2})$, where the digits
$u↓j↑\prime u↓{j+1}↑\prime u↓{j+2}↑\prime$ are obtained from $(u↓j
u↓{j+1}u↓{j+2}u↓{j+3})↓b$ by shifting left $e$ bits. Omit division by $d$ in
step D8.\xskip (In essence, $u$ and $v$ are being ``virtually'' shifted.
This method saves computation when $m$ is small
compared to $n$.)
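The virtual shift can be sketched in Python as follows; the names and the three-digit window are illustrative, and $b$ is assumed to be a power of 2.

```python
def shift_amount(v1, b):
    """Smallest e with b > (v1 << e) >= b//2; assumes v1 >= 1."""
    e = 0
    while (v1 << e) < b // 2:
        e += 1
    return e

def virtual_leading_digits(v1, v2, v3, b, e):
    """The two leading radix-b digits of 2**e * (v1 v2 v3)_b; since
    2**e * v1 < b, the shifted value still fits in three digits."""
    sv = ((v1 * b + v2) * b + v3) << e
    return sv // (b * b), (sv // b) % b
```

For example, with $b = 16$ and divisor digits $(3, 5, 9)$ we get $e = 2$, and the virtually normalized leading digits are $(13, 6)$.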
\eject % assumes that page break at this point works OK, so file v2ans1 not too long